HUMAN NEUROGENIN 3-ENCODING NUCLEOTIDE SEQUENCES

FIELD OF THE INVENTION 
The invention relates generally to the field of nucleotide sequences encoding
transcription factors involved in growth and differentiation, particularly development of
pancreatic islet cells.



BACKGROUND OF THE INVENTION 

Diabetes mellitus is the third leading cause of death in the U.S. and the leading cause
of blindness, renal failure, and amputation. Diabetes is also a major cause of premature heart
attacks and stroke and accounts for 15% of U.S. health care costs. Approximately 5% of
Americans, and as many as 20% of those over the age of 65, have diabetes.

Diabetes results from the failure of the β-cells in the islets of Langerhans in the
endocrine pancreas to produce adequate insulin to meet metabolic needs. Diabetes is
categorized into two clinical forms: Type 1 diabetes (or insulin-dependent diabetes) and Type
2 diabetes (or non-insulin-dependent diabetes). Type 1 diabetes is caused by the loss of the
insulin-producing β-cells. Type 2 diabetes is a more strongly genetic disease than Type 1
(Zonana & Rimoin, 1976 N. Engl. J. Med. 295:603), usually has its onset later in life, and
accounts for approximately 90% of diabetes in the U.S. Affected individuals usually have
both a decrease in the capacity of the pancreas to produce insulin and a defect in the ability to
utilize the insulin (insulin resistance). Obesity causes insulin resistance, and approximately
80% of individuals with Type 2 diabetes are clinically obese (greater than 20% above ideal
body weight). Unfortunately, about one-half of the people in the U.S. affected by Type 2
diabetes are unaware that they have the disease. Clinical symptoms associated with Type 2
diabetes may not become obvious until late in the disease, and the early signs are often
misdiagnosed, causing a delay in treatment and increased complications. While the role of
genetics in the etiology of Type 2 diabetes is clear, the precise genes involved are largely
unknown.

Insulin is made exclusively by the β-cells in the islets of Langerhans in the pancreas.
During development, the islet cells, including the β-cells, develop from an undifferentiated
precursor within the growing pancreatic bud. As the bud grows, the undifferentiated cells
form into ducts, and it is these cells that function as precursors. Duct cells appear to retain
the capacity to differentiate into islet cells throughout life, and when the pancreas is damaged,
new islet cells form from the duct cells.

This developmental process is clinically relevant for several reasons. First, the
formation of islet cells, and especially β-cells, is necessary in order to make insulin and control
energy metabolism. If the process of β-cell development is in any way impaired, it predisposes
that individual to the later development of diabetes. Therefore, genes involved in this process
are candidate genes for neonatal diabetes, maturity onset diabetes of the young (MODY) or
Type 2 diabetes. The sequence of these genes could be used to identify individuals at risk for
the development of diabetes, or to develop new pharmacological agents to prevent and treat
diabetes.

Second, as discussed above, insulin production is impaired in individuals with
diabetes. In Type 1 diabetes the impairment is caused by the destruction of the beta-cells,
while in Type 2 diabetes, insulin production is intact, but inadequate. Treatment of Type 1
diabetes, as well as many cases of Type 2 diabetes, may involve replacement of the β-cells.
While replacement of β-cells may be accomplished in several ways, the development of new
β-cells from precursor cells, either in culture or in vivo in the patient, would be the most
physiologic. To do this, the molecules that control beta-cell differentiation are needed.

For these reasons, the diabetes field has spent considerable effort in attempts to
identify islet precursor cells, and to develop methods for differentiating beta-cells in vitro. To
date this has been largely unsuccessful. The present invention addresses this problem.



Relevant Literature 

A cloned fragment of mouse Ngn3 is described in Sommer et al. 1996 Mol. Cell.
Neurosci. 8:221.

cDNA and amino acid sequences of murine Ngn3 and murine mammalian atonal
homolog 4B (MATH4B) are described at GenBank Accession Nos. U76208 and Y09167,
respectively.

cDNA and amino acid sequences of the rat relax transcriptional regulator are
described at GenBank Accession No. Y10619.
SUMMARY OF THE INVENTION
The present invention features a human neurogenin3 (Ngn3) polypeptide and
nucleotide sequences encoding Ngn3 polypeptides. In a particular aspect, the polynucleotide
is the nucleotide sequence of SEQ ID NO:1. In addition, the invention features an isolated
nucleic acid sequence comprising an Ngn3 promoter, as well as polynucleotide sequences
that hybridize under stringent conditions to SEQ ID NO:1. In related aspects the invention
features expression vectors and host cells comprising polynucleotides that encode a human
Ngn3 polypeptide. The present invention also relates to antibodies that bind specifically to a
human Ngn3 polypeptide, methods for producing human Ngn3 polypeptides, methods for
identifying β-cell precursor cells expressing Ngn3, methods for using the Ngn3 gene and the
Ngn3 polypeptide to alter cellular differentiation in culture or in vivo to produce new β-cells
to treat patients with diabetes mellitus, and identification of individuals at risk for diabetes by
detecting alteration in Ngn3 coding and regulatory sequences and Ngn3 expression levels.

A primary object of the invention is to provide an isolated human Ngn3 polypeptide-encoding
polynucleotide for use in expression of human Ngn3 (e.g., in a recombinant host
cell) and for use in, for example, identification of human Ngn3 polypeptide binding
compounds (especially those compounds that affect human Ngn3 polypeptide-mediated
activity, which compounds can be used to modulate Ngn3 activity).

Another object of the invention is to provide an isolated human Ngn3 polypeptide-encoding
polynucleotide for use in generation of non-human transgenic animal models for
Ngn3 gene function, particularly "knock-in" Ngn3 non-human transgenic animals
characterized by excess or ectopic expression of the Ngn3 gene.

These and other objects, advantages and features of the present invention will become
apparent to those persons skilled in the art upon reading the details of the invention more fully
set forth below.

The invention will now be described in further detail.



DETAILED DESCRIPTION OF THE INVENTION
Before the present nucleotide and polypeptide sequences are described, it is to be
understood that this invention is not limited to the particular methodology, protocols, cell
lines, vectors and reagents described, as such may, of course, vary. It is also to be understood
that the terminology used herein is for the purpose of describing particular embodiments only,
and is not intended to limit the scope of the present invention, which will be limited only by
the appended claims.

It must be noted that as used herein and in the appended claims, the singular forms
"a", "an", and "the" include plural referents unless the context clearly dictates otherwise.
Thus, for example, reference to "a host cell" includes a plurality of such host cells and
reference to "the antibody" includes reference to one or more antibodies and equivalents
thereof known to those skilled in the art, and so forth.

Unless defined otherwise, all technical and scientific terms used herein have the same
meaning as commonly understood by one of ordinary skill in the art to which this invention
belongs. Although any methods, devices and materials similar or equivalent to those
described herein can be used in the practice or testing of the invention, the preferred methods,
devices and materials are now described.

All publications mentioned herein are incorporated herein by reference for the purpose
of describing and disclosing, for example, the cell lines, vectors, and methodologies which are
described in the publications which might be used in connection with the presently described
invention. The publications discussed herein are provided solely for their disclosure prior to
the filing date of the present application. Nothing herein is to be construed as an admission
that the inventors are not entitled to antedate such disclosure by virtue of prior invention.

Definitions

"Polynucleotide" as used herein refers to an oligonucleotide, nucleotide, and
fragments or portions thereof, as well as to peptide nucleic acids (PNA), fragments, portions
or antisense molecules thereof, and to DNA or RNA of genomic or synthetic origin which can
be single- or double-stranded, and represent the sense or antisense strand. Where
"polynucleotide" is used to refer to a specific polynucleotide sequence (e.g., an Ngn3
polypeptide-encoding polynucleotide), "polynucleotide" is meant to encompass
polynucleotides that encode a polypeptide that is functionally equivalent to the recited
polypeptide, e.g., polynucleotides that are degenerate variants, or polynucleotides that encode
biologically active variants or fragments of the recited polypeptide, including polynucleotides
having substantial sequence similarity or sequence identity relative to the sequences provided
herein. Similarly, "polypeptide" as used herein refers to an oligopeptide, peptide, or protein.
Where "polypeptide" is recited herein to refer to an amino acid sequence of a naturally-occurring
protein molecule, "polypeptide" and like terms are not meant to limit the amino acid
sequence to the complete, native amino acid sequence associated with the recited protein
molecule, but instead are meant to also encompass biologically active variants or fragments,
including polypeptides having substantial sequence similarity or sequence identity relative to
the amino acid sequences provided herein.
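
By way of illustration only, the notion of a "degenerate variant" used in the definition above can be sketched in a few lines of Python. The sketch assumes the standard genetic code; the two 9-nt sequences and the function names are hypothetical and are not taken from SEQ ID NO:1.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence):
    """Translate an upper-case DNA coding sequence, ignoring a trailing partial codon."""
    usable = len(coding_sequence) - len(coding_sequence) % 3
    return "".join(
        CODON_TABLE[coding_sequence[i:i + 3]] for i in range(0, usable, 3)
    )


def is_degenerate_variant(seq_a, seq_b):
    """True when two different nucleotide sequences encode the same peptide."""
    return seq_a != seq_b and translate(seq_a) == translate(seq_b)


# Hypothetical sequences: both encode Met-Leu-Glu using different codons.
print(translate("ATGCTGGAA"))
print(is_degenerate_variant("ATGCTGGAA", "ATGTTAGAG"))
```

The comparison is made at the level of the encoded peptide, which is why two polynucleotides that differ at several positions can still fall within the same definition.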

By "antisense polynucleotide" is meant a polynucleotide having a nucleotide sequence
complementary to a given polynucleotide sequence (e.g., a polynucleotide sequence encoding
an Ngn3 polypeptide) including polynucleotide sequences associated with the transcription or
translation of the given polynucleotide sequence (e.g., a promoter of a polynucleotide
encoding an Ngn3 polypeptide), where the antisense polynucleotide is capable of hybridizing
to an Ngn3 polypeptide-encoding polynucleotide sequence. Of particular interest are
antisense polynucleotides capable of inhibiting transcription and/or translation of an Ngn3-encoding
polynucleotide either in vitro or in vivo.

"Peptide nucleic acid" as used herein refers to a molecule which comprises an
oligomer to which an amino acid residue, such as lysine, and an amino group have been
added. These small molecules, also designated anti-gene agents, stop transcript elongation by
binding to their complementary (template) strand of nucleic acid (Nielsen et al. 1993
Anticancer Drug Des 8:53-63).
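
The complementarity on which the antisense polynucleotides defined above rely can be sketched as follows. This is a minimal illustration that assumes perfect Watson-Crick pairing over the full length of the antisense molecule, which is a simplification of hybridization; the sequences and function names are hypothetical.

```python
# Watson-Crick complements for upper-case DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


def is_perfect_antisense(antisense, target):
    """True when the antisense sequence pairs perfectly with some stretch of the target."""
    return reverse_complement(antisense) in target


# Hypothetical target strand and an antisense oligonucleotide against part of it.
target = "GGATCCATGACG"
antisense = reverse_complement("ATGACG")

print(antisense)
print(is_perfect_antisense(antisense, target))
```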

As used herein, "Ngn3 polypeptide" refers to an amino acid sequence of a
recombinant or nonrecombinant polypeptide having an amino acid sequence of i) a native
Ngn3 polypeptide, ii) a biologically active fragment of an Ngn3 polypeptide, iii) biologically
active polypeptide analogs of an Ngn3 polypeptide, or iv) a biologically active variant of an
Ngn3 polypeptide. Ngn3 polypeptides of the invention can be obtained from any species,
e.g., mammalian or non-mammalian (e.g., reptiles, amphibians, avian (e.g., chicken)),
particularly mammalian, including human, rodent (e.g., murine or rat), bovine, ovine, porcine,
murine, or equine, preferably rat or human, from any source whether natural, synthetic,
semi-synthetic or recombinant. "Human Ngn3 polypeptide" refers to the amino acid
sequences of isolated human Ngn3 polypeptide obtained from a human, and is meant to
include all naturally-occurring allelic variants, and is not meant to limit the amino acid
sequence to the complete, native amino acid sequence associated with the recited protein
molecule.






WO 00/59936

PCT/US00/08436



As used herein, "antigenic amino acid sequence" means an amino acid sequence that, 
either alone or in association with a carrier molecule, can elicit an antibody response in a 
mammal. 

A "variant" of a human Ngn3 polypeptide is defined as an amino acid sequence that is
altered by one or more amino acids. The variant can have "conservative" changes, wherein a
substituted amino acid has similar structural or chemical properties, e.g., replacement of
leucine with isoleucine. More rarely, a variant can have "nonconservative" changes, e.g.,
replacement of a glycine with a tryptophan. Similar minor variations can also include amino
acid deletions or insertions, or both. Guidance in determining which and how many amino
acid residues may be substituted, inserted or deleted without abolishing biological or
immunological activity can be found using computer programs well known in the art, for
example, DNAStar software.

A "deletion" is defined as a change in either amino acid or nucleotide sequence in
which one or more amino acid or nucleotide residues, respectively, are absent as compared to
an amino acid sequence or nucleotide sequence of a naturally occurring Ngn3 polypeptide.

An "insertion" or "addition" is that change in an amino acid or nucleotide sequence
which has resulted in the addition of one or more amino acid or nucleotide residues,
respectively, as compared to an amino acid sequence or nucleotide sequence of a naturally
occurring Ngn3 polypeptide.

A "substitution" results from the replacement of one or more amino acids or
nucleotides by different amino acids or nucleotides, respectively, as compared to an amino
acid sequence or nucleotide sequence of a naturally occurring Ngn3 polypeptide.
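
The three kinds of change defined above can be illustrated with a short Python sketch that tallies them from a pairwise alignment. It assumes the two sequences have already been aligned to equal length with "-" marking gaps; the peptide strings and the function name are hypothetical and do not correspond to any Ngn3 sequence.

```python
def classify_changes(reference, variant):
    """Count substitutions, deletions and insertions in the variant relative to the reference.

    Both arguments are rows of a pairwise alignment of equal length,
    with '-' used as the gap character.
    """
    if len(reference) != len(variant):
        raise ValueError("aligned sequences must have the same length")
    counts = {"substitution": 0, "deletion": 0, "insertion": 0}
    for ref_residue, var_residue in zip(reference, variant):
        if ref_residue == var_residue:
            continue  # identical residue (or gap in both rows): no change
        if var_residue == "-":
            counts["deletion"] += 1      # residue absent from the variant
        elif ref_residue == "-":
            counts["insertion"] += 1     # residue added in the variant
        else:
            counts["substitution"] += 1  # residue replaced by a different one
    return counts


# Hypothetical aligned peptides: T->S substitution, inserted A, deleted G.
print(classify_changes("MTPQ-LSG", "MSPQALS-"))
```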

The term "biologically active" refers to human Ngn3 polypeptide having structural,
regulatory, or biochemical functions of a naturally occurring Ngn3 polypeptide. Likewise,
"immunologically active" defines the capability of the natural, recombinant or synthetic human
Ngn3 polypeptide, or any oligopeptide thereof, to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "derivative" as used herein refers to the chemical modification of a nucleic
acid encoding a human Ngn3 polypeptide or the encoded human Ngn3 polypeptide.
Illustrative of such modifications would be replacement of hydrogen by an alkyl, acyl, or
amino group. A nucleic acid derivative would encode a polypeptide which retains essential
biological characteristics of a natural Ngn3 polypeptide.

As used herein the term "isolated" is meant to describe a compound of interest (e.g.,
either a polynucleotide or a polypeptide) that is in an environment different from that in which
the compound naturally occurs. "Isolated" is meant to include compounds that are within
samples that are substantially enriched for the compound of interest and/or in which the
compound of interest is partially or substantially purified.

As used herein, the term "substantially purified" refers to a compound (e.g., either a
polynucleotide or a polypeptide) that is removed from its natural environment and is at least
60% free, preferably 75% free, and most preferably 90% free from other components with
which it is naturally associated.

"Stringency" typically occurs in a range from about Tm-5°C (5°C below the Tm of the
probe) to about 20°C to 25°C below Tm. As will be understood by those of skill in the art, a
stringency hybridization can be used to identify or detect identical polynucleotide sequences
or to identify or detect similar or related polynucleotide sequences.
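
As a worked example of the stringency range just described, the following Python sketch converts a probe melting temperature into the corresponding temperature window. The function name and the Tm value of 70°C are hypothetical; how the Tm itself is estimated is outside the scope of the sketch.

```python
def stringency_window(tm_celsius):
    """Return the stringency range implied by a probe Tm, in degrees Celsius.

    The result is (most_stringent, (least_stringent_low, least_stringent_high)):
    the upper end is Tm - 5, and the lower end lies 20 to 25 degrees below Tm.
    """
    most_stringent = tm_celsius - 5.0
    least_stringent = (tm_celsius - 25.0, tm_celsius - 20.0)
    return most_stringent, least_stringent


# Hypothetical probe with a Tm of 70 degrees Celsius.
high, (low_min, low_max) = stringency_window(70.0)
print(f"about {high:.0f} C down to about {low_min:.0f}-{low_max:.0f} C")
```

Temperatures near the upper end of the window favor detection of identical sequences, while temperatures toward the lower end also permit detection of similar or related sequences.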

The term "hybridization" as used herein shall include "any process by which a strand
of nucleic acid joins with a complementary strand through base pairing" (Coombs 1994
Dictionary of Biotechnology, Stockton Press, New York NY). Amplification as carried out
in the polymerase chain reaction technologies is described in Dieffenbach et al. 1995, PCR
Primer, a Laboratory Manual, Cold Spring Harbor Press, Plainview NY.

By "transformation" is meant a permanent or transient genetic change, preferably a
permanent genetic change, induced in a cell following incorporation of new DNA (i.e., DNA
exogenous to the cell). Genetic change can be accomplished either by incorporation of the
new DNA into the genome of the host cell, or by transient or stable maintenance of the new
DNA as an episomal element. Where the cell is a mammalian cell, a permanent genetic
change is generally achieved by introduction of the DNA into the genome of the cell.

By "construct" is meant a recombinant nucleic acid, generally recombinant DNA, that
has been generated for the purpose of the expression of a specific nucleotide sequence(s), or
is to be used in the construction of other recombinant nucleotide sequences.

By "operably linked" is meant that a DNA sequence and a regulatory sequence(s) are
connected in such a way as to permit gene expression when the appropriate molecules (e.g.,
transcriptional activator proteins) are bound to the regulatory sequence(s).

By "operatively inserted" is meant that a nucleotide sequence of interest is positioned
adjacent a nucleotide sequence that directs transcription and translation of the introduced
nucleotide sequence of interest (i.e., facilitates the production of, e.g., a polypeptide encoded
by an Ngn3 sequence).

By "Ngn3-associated disorder" is meant a physiological condition or disease
associated with altered Ngn3 function (e.g., due to aberrant Ngn3 expression or a defect in
Ngn3 expression or in the Ngn3 protein). Such Ngn3-associated disorders can include, but
are not necessarily limited to, disorders associated with reduced levels of insulin or the ability
to utilize insulin (e.g., hyperglycemia, diabetes (e.g., Type 1 and Type 2 diabetes), and the
like).

By "subject" or "patient" is meant any mammalian subject for whom diagnosis or
therapy is desired, particularly humans. Other subjects may include cattle, dogs, cats, guinea
pigs, rabbits, rats, mice, horses, and so on. Of particular interest are subjects having an
Ngn3-associated disorder that is amenable to treatment (e.g., to mitigate symptoms
associated with the disorder) by expression of Ngn3-encoding nucleic acid in a cell of
the subject (e.g., by introduction of the Ngn3-encoding nucleic acid into the subject in vivo,
or by implanting Ngn3-expressing cells (e.g., β-cell precursors) or nearly developed or mature
β-cells cultured from Ngn3-expressing cells into the subject, which cells produce insulin).

The term "transgene" is used herein to describe genetic material which has been or is
about to be artificially inserted into the genome of a mammalian cell, particularly a mammalian
cell of a living animal.

By "transgenic organism" is meant a non-human organism (e.g., single-cell organisms
(e.g., yeast), mammal, non-mammal (e.g., nematode or Drosophila)) having a non-endogenous
(i.e., heterologous) nucleic acid sequence present as an extrachromosomal
element in a portion of its cells or stably integrated into its germ line DNA.

By "transgenic animal" is meant a non-human animal, usually a mammal, having a
non-endogenous (i.e., heterologous) nucleic acid sequence present as an extrachromosomal
element in a portion of its cells or stably integrated into its germ line DNA (i.e., in the
genomic sequence of most or all of its cells). Heterologous nucleic acid is introduced into the
germ line of such transgenic animals by genetic manipulation of, for example, embryos or
embryonic stem cells of the host animal.

A "knock-out" of a target gene means an alteration in the sequence of the gene that
results in a decrease of function of the target gene, preferably such that target gene expression
is undetectable or insignificant. A knock-out of an Ngn3 gene means that function of the
Ngn3 gene has been substantially decreased so that Ngn3 expression is not detectable or only
present at insignificant levels. "Knock-out" transgenics of the invention can be transgenic
animals having a heterozygous knock-out of the Ngn3 gene or a homozygous knock-out of
the Ngn3 gene. "Knock-outs" also include conditional knock-outs, where alteration of the
target gene can occur upon, for example, exposure of the animal to a substance that promotes
target gene alteration, introduction of an enzyme that promotes recombination at the target
gene site (e.g., Cre in the Cre-lox system), or other method for directing the target gene
alteration postnatally.

A "knock-in" of a target gene means an alteration in a host cell genome that results in
altered expression (e.g., increased (including ectopic) or decreased expression) of the target
gene, e.g., by introduction of an additional copy of the target gene, or by operatively inserting
a regulatory sequence that provides for enhanced expression of an endogenous copy of the
target gene. "Knock-in" transgenics of the invention can be transgenic animals having a
heterozygous knock-in of the Ngn3 gene or a homozygous knock-in of the Ngn3 gene.
"Knock-ins" also encompass conditional knock-ins.

Overview of the Invention 

The present invention is based upon the identification and isolation of a polynucleotide
sequence encoding a human neurogenin3 (Ngn3) polypeptide, as well as the human and
murine Ngn3 promoters. Accordingly, the present invention encompasses such human Ngn3
polypeptide-encoding polynucleotides, as well as human Ngn3 polypeptides encoded by such
polynucleotides. Expression of Ngn3 is linked to pancreatic development. Specifically, Ngn3
expression is the earliest available marker of cells that will develop into islet cells. Because
Ngn3 expression is extinguished before the cells are completely differentiated, Ngn3 uniquely
marks precursor cells. The proof that these are islet cell precursors is based on three pieces
of evidence: 1) Expression pattern. Ngn3 cells are seen scattered through the pancreatic
duct cells, with a smaller number present adjacent to the ducts. 2) Timing. The appearance
of the Ngn3-positive cells parallels the formation of new islet cells during development. 3)
Ngn3-positive cells co-express other endocrine transcription factors, including the β-cell
transcription factor Nkx6.1. Nkx6.1 is known to be expressed in β-cells and β-cell
precursors at this stage of pancreatic development, and the knock-out of the Nkx6.1 gene in
mice causes a specific defect in β-cell development, but no defect in the formation of other
pancreatic cells (see, e.g., WO 99/05258).

The invention also encompasses the use of the polynucleotides disclosed herein to
facilitate identification and isolation of polynucleotide and polypeptide sequences having
homology to a human Ngn3 polypeptide of the invention. The human Ngn3 polypeptides and
polynucleotides of the invention are also useful in the identification of human Ngn3
polypeptide-binding compounds, particularly human Ngn3 polypeptide-binding compounds
having human Ngn3 polypeptide agonist or antagonist activity. In addition, the human Ngn3
polypeptides and polynucleotides of the invention are useful in the diagnosis, prevention and
treatment of disease associated with human Ngn3 polypeptide biological activity.

The human Ngn3 polypeptide-encoding polynucleotides of the invention can also be
used in the development of β-cells in culture and in vivo, as a molecular probe with which to
determine the structure, location, and expression of the human Ngn3 polypeptide and related
polypeptides in mammals (including humans), and to investigate potential associations
between disease states or clinical disorders and defects or alterations in human Ngn3
polypeptide structure, expression, or function.

Ngn3 Nucleic Acid 

The term "Ngn3 gene" is used generically to designate Ngn3 genes and their alternate
forms. "Ngn3 gene" is also intended to mean the open reading frame encoding specific Ngn3
polypeptides, introns, and adjacent 5' and 3' non-coding nucleotide sequences involved in the
regulation of expression, up to about 1 kb beyond the coding region, but possibly further in
either direction. The DNA sequences encoding Ngn3 may be cDNA or genomic DNA or a
fragment thereof. The gene may be introduced into an appropriate vector for
extrachromosomal maintenance or for integration into the host.

The term "cDNA" as used herein is intended to include all nucleic acids that share the
arrangement of sequence elements found in native mature mRNA species, where sequence
elements are exons (e.g., sequences encoding open reading frames of the encoded
polypeptide) and 3' and 5' non-coding regions. Normally mRNA species have contiguous
exons, with the intervening introns removed by nuclear RNA splicing, to create a continuous
open reading frame encoding the Ngn3 polypeptide.
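
The relationship between a genomic sequence and its cDNA described above, in which exons are joined after removal of intervening introns, can be sketched as follows. The genomic string and exon coordinates are hypothetical and serve only to show the arrangement of sequence elements; they are not derived from any Ngn3 sequence.

```python
def splice(genomic_sequence, exons):
    """Join exon segments, given as (start, end) half-open intervals, into one contiguous sequence."""
    return "".join(genomic_sequence[start:end] for start, end in exons)


# Hypothetical gene: two exons (upper case) separated by one intron (lower case).
genomic = "ATGGCC" + "gtaagtcag" + "GTTTAA"
exons = [(0, 6), (15, 21)]

cdna_like = splice(genomic, exons)
print(cdna_like)
```

For a coding region that is not interrupted by introns, the exon list reduces to a single interval and the spliced sequence is identical to the genomic coding sequence.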

While other genomic Ngn3 sequences of other sources may have non-contiguous
open reading frames (e.g., where introns interrupt the protein coding regions), the human
genomic Ngn3 sequence has no introns interrupting the coding sequence. A genomic
sequence of interest comprises the nucleic acid present between the initiation codon and the
stop codon, as defined in the listed sequences, including all of the introns that are normally
present in a native chromosome. It may further include the 3' and 5' untranslated regions
found in the mature mRNA. It may further include specific transcriptional and translational
regulatory sequences, such as promoters, enhancers, etc., including about 1 kb, but possibly
more, of flanking genomic DNA at either the 5' or 3' end of the transcribed region. The
genomic DNA may be isolated as a fragment of 100 kbp or smaller, and substantially free of
flanking chromosomal sequence.

The sequence of this 5' region, and further 5' upstream sequences and 3' downstream
sequences, may be utilized for promoter elements, including enhancer binding sites, that
provide for expression in tissues where Ngn3 is expressed. The sequences of the Ngn3
promoter elements of the invention can be based on the nucleotide sequences of any species
(e.g., mammalian or non-mammalian (e.g., reptiles, amphibians, avian (e.g., chicken)),
particularly mammalian, including human, rodent (e.g., murine or rat), bovine, ovine, porcine,
murine, or equine, preferably rat or human) and can be isolated or produced from any source
whether natural, synthetic, semi-synthetic or recombinant.

The tissue specific expression of Ngn3 is useful for determining the pattern of
expression, and for providing promoters that mimic the native pattern of expression.
Naturally occurring polymorphisms in the promoter region are useful for determining natural
variations in expression, particularly those that may be associated with disease. Alternatively,
mutations may be introduced into the promoter region to determine the effect of altering
expression in experimentally defined systems. Methods for the identification of specific DNA
motifs involved in the binding of transcriptional factors are known in the art, e.g. sequence
similarity to known binding motifs, gel retardation studies, etc. For examples, see Blackwell
et al. 1995 Mol Med 1:194-205; Mortlock et al. 1996 Genome Res. 6:327-33; and Joulin and
Richard-Foy (1995) Eur J Biochem 232:620-626.

In one embodiment, the Ngn3 promoter is used to direct expression of genes to islet
cell precursors. As discussed below, Ngn3 is expressed in islet cell precursors during
development of β-cells. Thus, the developmentally timed expression directed by the Ngn3
promoter can be exploited to facilitate expression of heterologous genes operably linked to
the Ngn3 promoter. Exemplary genes of interest that can be expressed from the Ngn3
promoter include, but are not necessarily limited to, genes encoding growth factors or
oncogenes (e.g., to expand and/or immortalize the β-cell progenitor population), marker
genes (e.g., for marking the precursor cells for selection and/or tracing), reporter genes (e.g.,
luciferase, CAT, etc., for, e.g., identifying mechanisms for regulating the Ngn3 promoter
and/or to search for bioactive agents (e.g., candidate pharmaceutical agents) that regulate the
promoter), and the like.

The regulatory sequences may be used to identify cis acting sequences required for
transcriptional or translational regulation of Ngn3 expression, especially in different tissues or
stages of development, and to identify cis acting sequences and trans acting factors that
regulate or mediate Ngn3 expression. Such transcriptional or translational control regions
may be operably linked to an Ngn3 gene or other genes in order to promote expression of
wild type or altered Ngn3 or other proteins of interest in cultured cells, or in embryonic, fetal
or adult tissues, and for gene therapy. Ngn3 transcriptional or translational control regions
can also be used to identify extracellular signal molecules that regulate Ngn3 promoter
activity, and thus regulate Ngn3 expression and islet cell formation.

The nucleic acid compositions used in the subject invention may encode all or a part
of the Ngn3 polypeptides as appropriate. Fragments of the DNA sequence may be obtained
by chemically synthesizing oligonucleotides in accordance with conventional methods, by
restriction enzyme digestion, by PCR amplification, etc. For the most part, DNA fragments
will be of at least about ten contiguous nucleotides, usually at least about 15 nt, more usually
at least about 18 nt to about 20 nt, more usually at least about 25 nt to about 50 nt. Such
small DNA fragments are useful as primers for PCR, hybridization screening, etc. Larger
DNA fragments, i.e. greater than 100 nt, are useful for production of the encoded polypeptide.
For use in amplification reactions, such as PCR, a pair of primers will be used. The exact
composition of the primer sequences is not critical to the invention, but for most applications
the primers will hybridize to the subject sequence under stringent conditions, as known in the
art. It is preferable to choose a pair of primers that will generate an amplification product of
at least about 50 nt, preferably at least about 100 nt. Algorithms for the selection of primer
sequences are generally known, and are available in commercial software packages.
Amplification primers hybridize to complementary strands of DNA, and will prime towards
each other.
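
The geometry of a primer pair described above, in which the two primers hybridize to complementary strands and prime towards each other, can be sketched in Python. The sketch assumes exact matching of each primer to its site and a single occurrence of each site; the template, the primers and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of an upper-case DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def amplicon_length(template, forward_primer, reverse_primer):
    """Length of the product bounded by a primer pair, or None if either site is missing.

    The forward primer matches the given (top) strand. The reverse primer
    anneals to the top strand, so it is located by searching downstream of
    the forward primer for its reverse complement.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    site_start = template.find(reverse_site, start + len(forward_primer))
    if site_start == -1:
        return None
    return site_start + len(reverse_site) - start


# Hypothetical template with two 10-nt primer sites separated by 40 nt.
template = "GGG" + "ACGTACGTAC" + "T" * 40 + "CCATGGCATG" + "AAA"
forward = "ACGTACGTAC"
reverse = reverse_complement("CCATGGCATG")

length = amplicon_length(template, forward, reverse)
print(length, length is not None and length >= 50)
```

The final comparison corresponds to the stated preference for an amplification product of at least about 50 nt.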

The Ngn3 gene is isolated and obtained in substantial purity, generally as other than
an intact mammalian chromosome. Usually, the DNA will be obtained substantially free of
other nucleic acid sequences that do not include an Ngn3 sequence or fragment thereof,
generally being at least about 50%, usually at least about 90% pure, and is typically
"recombinant", i.e. flanked by one or more nucleotides with which it is not normally
associated on a naturally occurring chromosome.

The DNA sequences are used in a variety of ways. They may be used as probes for
identifying homologs of Ngn3. Mammalian homologs have substantial sequence similarity to
one another, i.e. at least 75%, usually at least 90%, more usually at least 95% sequence
identity. Sequence similarity and sequence identity are calculated based on a reference
sequence, which may be a subset of a larger sequence, such as a conserved motif, coding
region, flanking region, etc. A reference sequence will usually be at least about 18 nt long,
more usually at least about 30 nt long, and may extend to the complete sequence that is being
compared. Algorithms for sequence analysis are known in the art, such as BLAST, described
in Altschul et al. 1990 J Mol Biol 215:403-10. For the purposes of the present application,
percent identity for the polynucleotides of the invention is determined using the BLASTN
program with the default settings as described at
http://www.ncbi.nlm.nih.gov/cgi-bin/BLAST/nph-newblast?Jform=0 with the DUST filter
selected. The DUST filter is described at http://www.ncbi.nlm.nih.gov/BLAST/filtered.html.
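
As a simplified illustration of how percent identity is expressed relative to a reference sequence, the following Python sketch counts matching positions over two sequences that have already been aligned. It is not a substitute for BLASTN, which performs the alignment itself; the function name and the two 20 nt sequences are hypothetical.

```python
def percent_identity(reference: str, candidate: str) -> float:
    """Percent of reference positions matched by an already-aligned candidate.

    Both strings must have equal length; '-' marks an alignment gap and never
    counts as a match.
    """
    if not reference or len(reference) != len(candidate):
        raise ValueError("sequences must be non-empty and of equal aligned length")
    matches = sum(
        1
        for ref_base, cand_base in zip(reference.upper(), candidate.upper())
        if ref_base == cand_base and ref_base != "-"
    )
    return 100.0 * matches / len(reference)


# Hypothetical 20 nt reference window and a candidate differing at one position.
REFERENCE = "ATGGCGCCTCATCCCTTGGA"
CANDIDATE = "ATGGCGCCTCATCCCTTGGT"

identity = percent_identity(REFERENCE, CANDIDATE)
print(f"{identity:.1f}% identity over {len(REFERENCE)} nt")
```

Under the thresholds recited above, a candidate scoring at least 75% over a reference window of at least about 18 nt would fall within the lowest recited level of similarity.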

Nucleic acids having sequence similarity are detected by hybridization under low
stringency conditions, for example, at 50°C and 6XSSC (0.9 M saline/0.09 M sodium citrate)
and remain bound when subjected to washing at 55°C in 1XSSC (0.15 M sodium
chloride/0.015 M sodium citrate). Sequence identity may be determined by hybridization
under high stringency conditions, for example, at 50°C or higher and 0.1XSSC (15 mM
saline/1.5 mM sodium citrate). By using probes, particularly labeled probes of DNA
sequences, one can isolate homologous or related genes. The source of homologous genes
may be any species, e.g. primate species, particularly human; rodents, such as rats and mice,
canines, felines, bovines, ovines, equines, yeast, Drosophila, Caenorhabditis, etc.
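
The salt concentrations recited for each SSC strength follow from simple scaling of the 1XSSC composition (0.15 M sodium chloride/0.015 M sodium citrate). The following Python sketch shows that arithmetic; the constant and function names are illustrative assumptions only.

```python
# Composition of 1X SSC, as recited in the text above.
NACL_M_AT_1X = 0.15
CITRATE_M_AT_1X = 0.015


def ssc_molarity(strength: float):
    """Return (sodium chloride M, sodium citrate M) for an N-fold SSC buffer."""
    if strength <= 0:
        raise ValueError("SSC strength must be positive")
    return strength * NACL_M_AT_1X, strength * CITRATE_M_AT_1X


# Low-stringency hybridization, wash, and high-stringency conditions.
for strength in (6.0, 1.0, 0.1):
    nacl, citrate = ssc_molarity(strength)
    print(f"{strength:g}X SSC: {nacl:.3g} M sodium chloride / {citrate:.3g} M sodium citrate")
```

Lower SSC strength means lower ionic strength and therefore higher stringency at a given temperature, which is why the 0.1XSSC condition is the one used for determining sequence identity.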

The Ngn3-encoding DNA may be used to identify expression of the gene in a
biological specimen. The manner in which one probes cells for the presence of particular
nucleotide sequences, as genomic DNA or RNA, is well established in the literature and does
not require elaboration here. mRNA is isolated from a cell sample. mRNA may be amplified
by RT-PCR, using reverse transcriptase to form a complementary DNA strand, followed by
polymerase chain reaction amplification using primers specific for the subject DNA
sequences. Alternatively, an mRNA sample is separated by gel electrophoresis, transferred to a
suitable support, e.g. nitrocellulose, nylon, etc., and then probed with a fragment of the
subject DNA as a probe. Other techniques, such as oligonucleotide ligation assays, in situ
hybridizations, and hybridization to DNA probes arrayed on a solid chip may also find use.
Detection of mRNA hybridizing to an Ngn3 sequence is indicative of Ngn3 gene expression
in the sample.

The Ngn3 nucleic acid sequence may be modified for a number of purposes,
particularly where it will be used intracellularly, for example, by being joined to a nucleic
acid cleaving agent, e.g. a chelated metal ion, such as iron or chromium for cleavage of the
gene; or the like.

The sequence of the Ngn3 locus, including flanking promoter regions and coding
regions, may be mutated in various ways known in the art to generate targeted changes in
promoter strength, sequence of the encoded protein, etc. The DNA sequence or product of
such a mutation will be substantially similar to the sequences provided herein, i.e. will differ
by at least one nucleotide or amino acid, respectively, and may differ by at least two but not
more than about ten nucleotides or amino acids. The sequence changes may be substitutions,
insertions or deletions. Deletions may further include larger changes, such as deletions of a
domain or exon. Other modifications of interest include epitope tagging, e.g. with the FLAG
system, HA, etc. For studies of subcellular localization, fusion proteins with green
fluorescent proteins (GFP) may be used. Such mutated genes may be used to study
structure-function relationships of Ngn3 polypeptides with other polypeptides (e.g., Nkx6.1, which is
co-expressed with Ngn3), or to alter properties of the proteins that affect their function or
regulation. Such modified Ngn3 sequences can be used to, for example, generate
transgenic animals.

Techniques for in vitro mutagenesis of cloned genes are known. Examples of
protocols for scanning mutations may be found in Gustin et al., 1993 Biotechniques 14:22;
Barany, 1985 Gene 37:111-23; Colicelli et al., 1985 Mol Gen Genet 199:537-9; and Prentki
et al., 1984 Gene 29:303-13. Methods for site specific mutagenesis can be found in
Sambrook et al., 1989 Molecular Cloning: A Laboratory Manual, CSH Press, pp. 15.3-
15.108; Weiner et al., 1993 Gene 126:35-41; Sayers et al., 1992 Biotechniques 13:592-6;
Jones and Winistorfer, 1992 Biotechniques 12:528-30; Barton et al., 1990 Nucleic Acids Res
18:7349-55; Marotti and Tomich, 1989 Gene Anal Tech 6:67-70; and Zhu 1989 Anal
Biochem 177:120-4.

Ngn3 Transgenic Animals

The Ngn3-encoding nucleic acids can be used to generate genetically modified
non-human animals or site specific gene modifications in cell lines. The term "transgenic" is
intended to encompass genetically modified animals having a deletion or other knock-out of
Ngn3 gene activity, having an exogenous Ngn3 gene that is stably transmitted in the host
cells, a "knock-in" having altered Ngn3 gene expression, or having an exogenous Ngn3
promoter operably linked to a reporter gene. Of particular interest are homozygous and
heterozygous knock-outs of Ngn3.

Transgenic animals may be made through homologous recombination, where the
Ngn3 locus is altered. Alternatively, a nucleic acid construct is randomly integrated into the
genome. Vectors for stable integration include plasmids, retroviruses and other animal
viruses, YACs, and the like. Of interest are transgenic mammals, preferably a mammal from a
genus selected from the group consisting of Mus (e.g., mice), Rattus (e.g., rats), Oryctolagus
(e.g., rabbits) and Mesocricetus (e.g., hamsters). More preferably the animal is a mouse
which is defective or contains some other alteration in Ngn3 gene expression or function.
Without being held to theory, Ngn3 is a transcription factor that is expressed in islet cell
precursors during pancreatic development; transgenic animals having altered Ngn3 gene
expression will therefore be useful models of pancreatic development.

A "knock-out" animal is genetically manipulated to substantially reduce, or eliminate,
endogenous Ngn3 function, preferably such that target gene expression is undetectable or
insignificant. Different approaches may be used to achieve the "knock-out". A chromosomal
deletion of all or part of the native Ngn3 homolog may be induced. Deletions of the
non-coding regions, particularly the promoter region, 3' regulatory sequences, or enhancers, or
deletions of genes that activate expression of the Ngn3 genes, may also be used. A functional
knock-out may also be achieved by the introduction of an anti-sense construct that blocks
expression of the native Ngn3 gene (for example, see Li and Cohen (1996) Cell 85:319-329).

Conditional knock-outs of Ngn3 gene function can also be generated. Conditional
knock-outs are transgenic animals that exhibit a defect in Ngn3 gene function upon exposure
of the animal to a substance that promotes target gene alteration, introduction of an enzyme
that promotes recombination at the target gene site (e.g., Cre in the Cre-loxP system), or
other method for directing the target gene alteration.

For example, a transgenic animal having a conditional knock-out of Ngn3 gene
function can be produced using the Cre-loxP recombination system (see, e.g., Kilby et al.
1993 Trends Genet 9:413-421). Cre is an enzyme that excises the DNA between two
recognition sequences, termed loxP. This system can be used in a variety of ways to create
conditional knock-outs of Ngn3. For example, two independent transgenic mice can be
produced: one transgenic for an Ngn3 sequence flanked by loxP sites and a second
transgenic for Cre. The Cre transgene can be under the control of an inducible or
developmentally regulated promoter (Gu et al. 1993 Cell 73:1155-1164, Gu et al. 1994
Science 265:103-106), or under control of a tissue-specific or cell type-specific promoter
(e.g., a pancreas-specific promoter or brain tissue-specific promoter). The Ngn3 transgenic is
then crossed with the Cre transgenic to produce progeny deficient for the Ngn3 gene only in
those cells that expressed Cre during development.
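
The logic of such a conditional knock-out can be sketched symbolically in Python. The sketch below models a locus as an ordered list of named features and assumes two loxP sites in the same orientation, so that Cre excises the intervening segment and leaves a single loxP site behind; the feature names, tissue names and function names are hypothetical and serve only to illustrate the principle.

```python
from typing import Dict, List

LOXP = "loxP"


def excise_floxed(locus: List[str]) -> List[str]:
    """Remove the features between the first two loxP sites, keeping one site."""
    sites = [index for index, feature in enumerate(locus) if feature == LOXP]
    if len(sites) < 2:
        return list(locus)  # nothing to excise without a flanking pair
    first, second = sites[0], sites[1]
    return locus[: first + 1] + locus[second + 1 :]


def locus_by_tissue(locus: List[str], cre_expressed: Dict[str, bool]) -> Dict[str, List[str]]:
    """Resulting locus in each tissue, given where Cre was expressed."""
    return {
        tissue: excise_floxed(locus) if has_cre else list(locus)
        for tissue, has_cre in cre_expressed.items()
    }


# Hypothetical floxed Ngn3 locus and a Cre transgene driven by a pancreas-specific promoter.
FLOXED_LOCUS = ["promoter", LOXP, "Ngn3_coding", LOXP, "polyA"]
CRE_EXPRESSION = {"pancreas": True, "brain": False}

progeny = locus_by_tissue(FLOXED_LOCUS, CRE_EXPRESSION)
for tissue, locus in progeny.items():
    print(tissue, locus)
```

In this sketch only the tissue in which Cre was expressed loses the Ngn3 coding segment, mirroring the progeny described above.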

Transgenic animals may be made having an exogenous Ngn3 gene. For example, the
transgenic animal may comprise a "knock-in" of an Ngn3 gene, such that the host cell genome
contains an alteration that results in altered expression (e.g., increased (including ectopic) or
decreased expression) of an Ngn3 gene, e.g., by introduction of an additional copy of the
target gene, or by operatively inserting a regulatory sequence that provides for enhanced
expression of an endogenous copy of the target gene. "Knock-in" transgenics can be
transgenic animals having a heterozygous knock-in of the Ngn3 gene or a homozygous
knock-in of the Ngn3 gene. "Knock-ins" also encompass conditional knock-ins.

The exogenous gene introduced into the host cell genome to produce a transgenic
animal is usually either from a different species than the animal host, or is otherwise altered in
its coding or non-coding sequence. The introduced gene may be a wild-type gene, naturally
occurring polymorphism, or a genetically manipulated sequence, for example those previously
described with deletions, substitutions or insertions in the coding or non-coding regions. The
introduced sequence may encode an Ngn3 polypeptide, or may utilize the Ngn3 promoter
operably linked to a reporter gene. Where the introduced gene is a coding sequence, it is
usually operably linked to a promoter, which may be constitutive or inducible, and other
regulatory sequences required for expression in the host animal.

Specific constructs of interest include, but are not limited to, anti-sense Ngn3, or a
ribozyme based on an Ngn3 sequence, which will block Ngn3 expression, as well as
expression of dominant negative Ngn3 mutations, and over-expression of an Ngn3 gene. A
detectable marker, such as lacZ, may be introduced into the Ngn3 locus, where upregulation
of expression of the corresponding Ngn3 gene will result in an easily detected change in
phenotype. Constructs utilizing a promoter region of the Ngn3 genes in combination with a
reporter gene or with the coding region of Ngn3 are also of interest. Constructs having a
sequence encoding a truncated or altered (e.g., mutated) Ngn3 are also of interest.

The modified cells or animals are useful in the study of function and regulation of
Ngn3 and other proteins involved in the pancreatic β-cell developmental pathway. Such
modified cells or animals are also useful in, for example, the study of the function and
regulation of genes whose expression is affected by Ngn3, as well as the study of the
development of insulin-secreting cells in the pancreas. Thus, the transgenic animals of the
invention are useful in identifying downstream targets of Ngn3, as such targets may have a
role in the phenotypes associated with defects in Ngn3.

Animals may also be used in functional studies, drug screening, etc., e.g., to determine
the effect of a candidate drug on islet cell development, on β-cell function and development
or on symptoms associated with disease or conditions associated with Ngn3 defects (e.g., on
symptoms associated with reduced insulin secretion, such as that associated with a
diabetic syndrome, including Type 2 diabetes). A series of small deletions and/or
substitutions may be made in the Ngn3 genes to determine the role of different
polypeptide-encoding regions in DNA binding, transcriptional regulation, etc. By providing expression of
Ngn3 protein in cells in which it is otherwise not normally produced (e.g., ectopic
expression), one can induce changes in cell behavior. These animals are also useful for
exploring models of inheritance of disorders associated with diabetes, e.g. dominant v.
recessive; relative effects of different alleles and synergistic effects between Ngn3 and other
genes elsewhere in the genome.

DNA constructs for homologous recombination will comprise at least a portion of the
Ngn3 gene with the desired genetic modification, and will include regions of homology to the
target locus. DNA constructs for random integration need not include regions of homology
to mediate recombination. Conveniently, markers for positive and negative selection are
included. Methods for generating cells having targeted gene modifications through
homologous recombination are known in the art. For various techniques for transfecting
mammalian cells, see Keown et al. 1990 Methods in Enzymology 185:527-537.

For embryonic stem (ES) cells, an ES cell line may be employed, or embryonic cells
may be obtained freshly from a host, e.g. mouse, rat, guinea pig, etc. Such cells are grown on
an appropriate fibroblast-feeder layer or grown in the presence of appropriate growth factors,
such as leukemia inhibiting factor (LIF). When ES cells have been transformed, they may be
used to produce transgenic animals. After transformation, the cells are plated onto a feeder
layer in an appropriate medium. Cells containing the construct may be detected by employing
a selective medium. After sufficient time for colonies to grow, they are picked and analyzed
for the occurrence of homologous recombination or integration of the construct. Those
colonies that are positive may then be used for embryo manipulation and blastocyst injection.
Blastocysts are obtained from 4 to 6 week old superovulated females. The ES cells are
trypsinized, and the modified cells are injected into the blastocoel of the blastocyst. After
injection, the blastocysts are returned to each uterine horn of pseudopregnant females.
Females are then allowed to go to term and the resulting litters screened for mutant cells
having the construct. By providing for a different phenotype of the blastocyst and the ES
cells, chimeric progeny can be readily detected.

The chimeric animals are screened for the presence of the modified gene. Chimeric
animals having the modification (normally chimeric males) are mated with wildtype animals to
produce heterozygotes, and the heterozygotes mated to produce homozygotes. If the gene
alterations cause lethality at some point in development, tissues or organs can be maintained
as allogeneic or congenic grafts or transplants, or in in vitro culture.
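
The expected outcome of mating heterozygotes can be illustrated with a simple Punnett-square enumeration in Python. The sketch assumes ordinary Mendelian segregation of a single modified allele and no loss of any genotype during development, which, as noted above, may not hold where the alteration is lethal; the allele symbols ("+" for wildtype, "-" for modified) and the function name are illustrative only.

```python
from collections import Counter
from itertools import product


def cross(parent_a: str, parent_b: str) -> Counter:
    """Count offspring genotypes for two diploid parents, e.g. '+-' x '+-'.

    Each parent is a two-character string of alleles; every combination of one
    allele from each parent is counted once, as in a Punnett square.
    """
    offspring = Counter()
    for allele_a, allele_b in product(parent_a, parent_b):
        genotype = "".join(sorted(allele_a + allele_b))
        offspring[genotype] += 1
    return offspring


# Heterozygote x heterozygote: wildtype, heterozygous and homozygous-modified progeny.
het_cross = cross("+-", "+-")
for genotype in ("++", "+-", "--"):
    print(genotype, het_cross[genotype], "of", sum(het_cross.values()))
```

On these assumptions one quarter of the progeny of a heterozygote mating are expected to be homozygous for the modification; a marked shortfall from that proportion among live-born progeny would be consistent with lethality of the homozygous alteration.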

Investigation of genetic function may utilize non-mammalian models, particularly
using those organisms that are biologically and genetically well-characterized, such as
C. elegans, D. melanogaster and S. cerevisiae. For example, transposon (Tc1) insertions in
the nematode homolog of an Ngn3 gene or a promoter region of an Ngn3 gene may be made.
The Ngn3 gene sequences may be used to knock-out or to complement defined genetic
lesions in order to determine the physiological and biochemical pathways involved in function
of islet cells. It is well known that human genes can complement mutations in lower
eukaryotic models.

Production of Ngn3 Polypeptides

Ngn3-encoding nucleic acid may be employed to synthesize full-length Ngn3
polypeptides or fragments thereof, particularly fragments corresponding to functional
domains; DNA binding sites; etc.; and including fusions of the subject polypeptides to other
proteins or parts thereof. For expression, an expression cassette may be employed, providing
for a transcriptional and translational initiation region, which may be inducible or constitutive,
where the coding region is operably linked under the transcriptional control of the
transcriptional initiation region, and a transcriptional and translational termination region.
Various transcriptional initiation regions may be employed that are functional in the
expression host.

As discussed above, the invention encompasses both isolated, naturally-occurring
Ngn3 polypeptides, as well as recombinant Ngn3 polypeptides and functional equivalents of
such recombinant and/or naturally-occurring Ngn3 polypeptides, e.g., biologically active
variants sharing substantial or significant amino acid sequence similarity and/or sequence
identity with an Ngn3 amino acid sequence provided herein. Substantial identity, when
referring to the Ngn3 polypeptides of the invention, means polypeptides having at least about
70%, typically at least about 80% and preferably at least about 90% to about 95% identity to
the amino acid sequence of SEQ ID NO:2, or that are encoded by polynucleotides which will
hybridize under stringent conditions to a polynucleotide having the nucleotide sequence of
SEQ ID NO:1 or SEQ ID NO:3. Percent identity for the polypeptides of the invention is
determined using the BLASTP program with the default settings as described at
http://www.ncbi.nlm.nih.gov/cgi-bin/BLAST/nph-newblast?Jform=0 with the DUST filter
selected. The DUST filter is described at http://www.ncbi.nlm.nih.gov/BLAST/filtered.html.
Accordingly, the Ngn3 polynucleotides and polypeptides of this invention include,
without limitation, Ngn3 polypeptides and polynucleotides found in primates, rodents,
canines, felines, equines, nematodes, yeast and the like; and the natural and non-natural
variants thereof.

The polypeptides may be expressed in prokaryotes or eukaryotes in accordance with
conventional ways, depending upon the purpose for expression. For large scale production of
the protein, a unicellular organism, such as E. coli, B. subtilis, S. cerevisiae, or cells of a
higher organism such as vertebrates, particularly mammals, e.g. COS 7 cells, may be used as
the expression host cells. In many situations, it may be desirable to express the Ngn3 genes in
mammalian cells, especially where the encoded polypeptides will benefit from native folding
and post-translational modifications. Small peptides can also be synthesized in the laboratory.

With the availability of the polypeptides in large amounts, by employing an expression
host, the polypeptides may be isolated and purified in accordance with conventional ways. A
lysate may be prepared of the expression host and the lysate purified using HPLC, exclusion
chromatography, gel electrophoresis, affinity chromatography, or other purification
technique. The purified polypeptide will generally be at least about 80% pure, preferably at
least about 90% pure, and may be up to and including 100% pure. Pure is intended to mean
free of other proteins, as well as cellular debris.

The Ngn3 polypeptides can be used for the production of antibodies, where short
fragments provide for antibodies specific for the particular polypeptide, and larger fragments
or the entire protein allow for the production of antibodies over the surface of the
polypeptide. Antibodies may be raised to the wild-type or variant forms of Ngn3. Antibodies
may be raised to isolated peptides corresponding to these domains, or to the native protein,
e.g. by immunization with cells expressing Ngn3, immunization with liposomes having Ngn3
polypeptides inserted in the membrane, etc.

Antibodies are prepared in accordance with conventional ways, where the expressed
polypeptide or protein is used as an immunogen, by itself or conjugated to known
immunogenic carriers, e.g. KLH, pre-S HBsAg, other viral or eukaryotic proteins, or the like.
Various adjuvants may be employed, with a series of injections, as appropriate. For
monoclonal antibodies, after one or more booster injections, the spleen is isolated, the
lymphocytes immortalized by cell fusion, and then screened for high affinity antibody binding.
The immortalized cells, i.e. hybridomas, producing the desired antibodies may then be
expanded. For further description, see Monoclonal Antibodies: A Laboratory Manual,
Harlow and Lane eds., Cold Spring Harbor Laboratories, Cold Spring Harbor, New York,
1988. If desired, the mRNA encoding the heavy and light chains may be isolated and
mutagenized by cloning in E. coli, and the heavy and light chains mixed to further enhance the
affinity of the antibody. Alternatives to in vivo immunization as a method of raising
antibodies include binding to phage "display" libraries, usually in conjunction with in vitro
affinity maturation.

Isolation of Ngn3 Allelic Variants and Homologues in Other Species

Other mammalian Ngn3 genes can be identified and their function characterized using
the Ngn3 genes used in the present invention. Other Ngn3 genes of interest include, but are
not limited to, mammalian (e.g., human, rodent (e.g., murine, or rat), bovine, feline, canine,
and the like) and non-mammalian (e.g., chicken, reptile, and the like). Methods for
identifying, isolating, sequencing, and characterizing an unknown gene based upon its
homology to a known gene sequence are well known in the art (see, e.g., Sambrook et al.,
Molecular Cloning: A Laboratory Manual, CSH Press 1989).

Drug Screening

The animal models of the invention, as well as methods using the Ngn3 polypeptides
in vitro, can be used to identify candidate agents that affect Ngn3 expression (e.g., by
affecting Ngn3 promoter function) or that interact with Ngn3 polypeptides. Agents of
interest can include those that enhance, inhibit, regulate, or otherwise affect Ngn3 activity
and/or expression. Agents that alter Ngn3 activity and/or expression can be used to, for
example, treat or study disorders associated with decreased Ngn3 activity (e.g., diabetes or
other pancreatic disorders), and/or to facilitate development of islet cell precursors either in
vitro or in vivo. The term "candidate agents" is meant to include synthetic molecules (e.g., small
molecule drugs, peptides, or other synthetically produced molecules or compounds, as well as
recombinantly produced gene products) as well as naturally-occurring compounds (e.g.,
polypeptides, endogenous factors present in insulin-producing cells, hormones, plant extracts, and
the like).

Drug Screening Assays

Of particular interest in the present invention is the identification of agents that have
activity in affecting Ngn3 expression and/or function. Such agents are candidates for
development of treatments for, for example, diabetes or other condition that may be
associated with altered Ngn3 activity. Drug screening identifies agents that provide a
replacement or enhancement for Ngn3 function in affected cells. Conversely, agents that
reverse or inhibit Ngn3 function may provide a means to regulate insulin production. Of
particular interest are screening assays for agents that have a low toxicity for human cells.

The term "agent" as used herein describes any molecule, e.g. protein or
pharmaceutical, with the capability of altering or mimicking the expression or physiological
function of Ngn3. Generally a plurality of assay mixtures are run in parallel with different
agent concentrations to obtain a differential response to the various concentrations.
Typically, one of these concentrations serves as a negative control, i.e. at zero concentration
or below the level of detection.
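
A set of parallel assay concentrations of this kind, including the zero-concentration negative control, might be laid out as in the following Python sketch. The serial-dilution scheme, the function name, and the particular top concentration, dilution factor and units are assumptions chosen for illustration and are not requirements of the assays described herein.

```python
from typing import List


def concentration_series(top: float, dilution_factor: float, points: int) -> List[float]:
    """Ascending agent concentrations, led by a zero-concentration control.

    Starting from `top`, each further point is the previous one divided by
    `dilution_factor`; the zero entry serves as the negative control.
    """
    if top <= 0 or dilution_factor <= 1 or points < 1:
        raise ValueError("need top > 0, dilution_factor > 1 and at least one point")
    dilutions = [top / dilution_factor ** step for step in range(points)]
    return [0.0] + sorted(dilutions)


# Hypothetical plate layout: four ten-fold dilutions from 100 (arbitrary units) plus the control.
series = concentration_series(top=100.0, dilution_factor=10.0, points=4)
for well, concentration in enumerate(series, start=1):
    label = "negative control" if concentration == 0 else f"{concentration:g}"
    print(f"assay mixture {well}: {label}")
```

Running the assay mixtures in parallel across such a series allows the response at each concentration to be compared directly against the control.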

Candidate agents encompass numerous chemical classes, though typically they are
organic molecules, preferably small organic compounds having a molecular weight of more
than 50 and less than about 2,500 daltons. Candidate agents comprise functional groups
necessary for structural interaction with proteins, particularly hydrogen bonding, and typically
include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two of the
functional chemical groups. The candidate agents often comprise cyclical carbon or
heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or
more of the above functional groups. Candidate agents are also found among biomolecules
including, but not limited to: peptides, saccharides, fatty acids, steroids, purines, pyrimidines,
derivatives, structural analogs or combinations thereof.

Candidate agents are obtained from a wide variety of sources including libraries of
synthetic or natural compounds. For example, numerous means are available for random and
directed synthesis of a wide variety of organic compounds and biomolecules, including
expression of randomized oligonucleotides and oligopeptides. Alternatively, libraries of
natural compounds in the form of bacterial, fungal, plant and animal extracts are available or
readily produced. Additionally, natural or synthetically produced libraries and compounds are
readily modified through conventional chemical, physical and biochemical means, and may be
used to produce combinatorial libraries. Known pharmacological agents may be subjected to
directed or random chemical modifications, such as acylation, alkylation, esterification,
amidification, etc. to produce structural analogs.

Screening of Candidate Agents In Vivo 

Agents can be screened for their ability to affect Ngn3 expression or function or to
mitigate an undesirable phenotype (e.g., a symptom) associated with an alteration in Ngn3
expression or function. In a preferred embodiment, screening of candidate agents is
performed in vivo in a transgenic animal described herein. Transgenic animals suitable for use
in screening assays include any transgenic animal having an alteration in Ngn3 expression, and
can include transgenic animals having, for example, an exogenous and stably transmitted
human Ngn3 gene sequence, a reporter gene construct composed of a (e.g., human) Ngn3
promoter sequence operably linked to a reporter gene (e.g., β-galactosidase, CAT, or other
gene that can be easily assayed for expression), or a homozygous or heterozygous knockout
of an Ngn3 gene. The transgenic animals can be either homozygous or heterozygous for the
genetic alteration and, where a sequence is introduced into the animal's genome for
expression, may contain multiple copies of the introduced sequence. Where the in vivo
screening assay is to identify agents that affect the activity of the Ngn3 promoter, the Ngn3
promoter can be operably linked to a reporter gene (e.g., luciferase) and integrated into the
non-human host animal's genome or integrated into the genome of a cultured mammalian cell
line.

The candidate agent is administered to a non-human, transgenic animal having altered
Ngn3 expression, and the effects of the candidate agent determined. The candidate agent can
be administered in any manner desired and/or appropriate for delivery of the agent in order to
effect a desired result. For example, the candidate agent can be administered by injection
(e.g., by injection intravenously, intramuscularly, subcutaneously, or directly into the tissue in
which the desired effect is to be achieved), orally, or by any other desirable means. Normally,
the in vivo screen will involve a number of animals receiving varying amounts and
concentrations of the candidate agent (from no agent to an amount of agent that approaches
an upper limit of the amount that can be delivered successfully to the animal), and may
include delivery of the agent in different formulations. The agents can be administered singly
or can be combined in combinations of two or more, especially where administration of a
combination of agents may result in a synergistic effect.

The effect of agent administration upon the transgenic animal can be monitored by
assessing Ngn3 function as appropriate (e.g., by examining expression of a reporter or fusion
gene), or by assessing a phenotype associated with the Ngn3 expression. For example, where
the transgenic animal used in the screen contains a defect in Ngn3 expression (e.g., due to a
knock-out of the gene), the effect of the candidate agent can be assessed by determining
levels of hormones produced in the mouse relative to the levels produced in the Ngn3
defective transgenic mouse and/or in wildtype mice (e.g., by assessing levels of insulin).
Methods for assaying insulin are well known in the art. Where the in vivo screening assay is
to identify agents that affect the activity of the Ngn3 promoter and the non-human transgenic
animal (or cultured mammalian cell line) comprises an Ngn3 promoter operably linked to a
reporter gene, the effects of candidate agents upon Ngn3 promoter activity can be screened
by, for example, monitoring the expression from the Ngn3 promoter (through detection of the
reporter gene) and correlation of altered Ngn3 promoter activity with islet cell formation.
Alternatively or in addition, Ngn3 promoter activity can be assessed by detection (qualitative
or quantitative) of Ngn3 mRNA or protein levels. Where the candidate agent affects Ngn3
expression, and/or affects an Ngn3-associated phenotype, in a desired manner, the candidate
agent is identified as an agent suitable for use in therapy of an Ngn3-associated disorder
and/or to facilitate development of islet precursor cells to mature β-cells either in vivo or in
vitro.

Screening of Candidate Agents In Vitro 

In addition to screening of agents in Ngn3 transgenic animals, a wide variety of in
vitro assays may be used for this purpose, including labeled in vitro protein-protein binding
assays, protein-DNA binding assays, electrophoretic mobility shift assays, immunoassays for
protein binding, and the like. For example, by providing for the production of large amounts
of Ngn3 protein, one can identify ligands or substrates that bind to, modulate or mimic the
action of the proteins. The purified protein may also be used for determination of
three-dimensional crystal structure, which can be used for modeling intermolecular interactions,
transcriptional regulation, etc.

The screening assay can be a binding assay, wherein one or more of the molecules
may be joined to a label, and the label directly or indirectly provides a detectable signal.
Various labels include radioisotopes, fluorescers, chemiluminescers, enzymes, specific binding
molecules, particles, e.g. magnetic particles, and the like. Specific binding molecules include
pairs, such as biotin and streptavidin, digoxin and antidigoxin, etc. For the specific binding
members, the complementary member would normally be labeled with a molecule that
provides for detection, in accordance with known procedures.

A variety of other reagents may be included in the screening assays described herein.
Where the assay is a binding assay, these include reagents like salts, neutral proteins, e.g.
albumin, detergents, etc. that are used to facilitate optimal protein-protein binding,
protein-DNA binding, and/or reduce non-specific or background interactions. Reagents that improve
the efficiency of the assay, such as protease inhibitors, nuclease inhibitors, anti-microbial
agents, etc. may be used. The mixture of components is added in any order that provides
for the requisite binding. Incubations are performed at any suitable temperature, typically
between 4 and 40°C. Incubation periods are selected for optimum activity, but may also be
optimized to facilitate rapid high-throughput screening. Typically between 0.1 and 1 hours
will be sufficient.

Other assays of interest detect agents that mimic Ngn3 function. For example, 
candidate agents are added to a cell that lacks functional Ngn3, and screened for the ability to 
reproduce Ngn3 activity in a functional assay. 

Many mammalian genes have homologs in yeast and lower animals. The study of such
homologs' physiological role and interactions with other proteins in vivo or in vitro can
facilitate understanding of biological function. In addition to model systems based on genetic
complementation, yeast has been shown to be a powerful tool for studying protein-protein
interactions through the two-hybrid system described in Chien et al. 1991 Proc. Natl. Acad.
Sci. USA 88:9578-9582. Two-hybrid system analysis is of particular interest for exploring
transcriptional activation by Ngn3 proteins and for identifying cDNAs encoding polypeptides that
interact with Ngn3.

Identified Candidate Agents 

The compounds having the desired pharmacological activity may be administered in a
physiologically acceptable carrier to a host for treatment of a condition attributable to a defect
in Ngn3 function (e.g., a disorder associated with reduced insulin levels (e.g., diabetes (Type
1 or Type 2 diabetes, particularly Type 1 diabetes))). The compounds may also be used to
enhance Ngn3 function. The therapeutic agents may be administered in a variety of ways,
orally, topically, parenterally e.g. subcutaneously, intraperitoneally, by viral infection,
intravascularly, etc. Inhaled treatments are of particular interest. Depending upon the
manner of introduction, the compounds may be formulated in a variety of ways. The
concentration of therapeutically active compound in the formulation may vary from about
0.1-100

The pharmaceutical compositions can be prepared in various forms, such as granules,
tablets, pills, suppositories, capsules, suspensions, salves, lotions and the like. Pharmaceutical
grade organic or inorganic carriers and/or diluents suitable for oral and topical use can be
used to make up compositions containing the therapeutically-active compounds. Diluents
known to the art include aqueous media, vegetable and animal oils and fats. Stabilizing
agents, wetting and emulsifying agents, salts for varying the osmotic pressure or buffers for
securing an adequate pH value, and skin penetration enhancers can be used as auxiliary
agents.

Pharmacogenetics 

Pharmacogenetics is the linkage between an individual's genotype and that individual's
ability to metabolize or react to a therapeutic agent. Differences in metabolism or target
sensitivity can lead to severe toxicity or therapeutic failure by altering the relation between
bioactive dose and blood concentration of the drug. In the past few years, numerous studies
have established good relationships between polymorphisms in metabolic enzymes or drug
targets, and both response and toxicity. These relationships can be used to individualize
therapeutic dose administration.

Genotyping of polymorphic alleles is used to evaluate whether an individual will
respond well to a particular therapeutic regimen. The polymorphic sequences are also used in
drug screening assays, to determine the dose and specificity of a candidate therapeutic agent.
A candidate Ngn3 polymorphism is screened with a target therapy to determine whether there
is an influence on the effectiveness in treating, for example, diabetes. Drug screening assays
are performed as described above. Typically two or more different sequence polymorphisms
are tested for response to a therapy. Therapies for diabetes currently include replacement
therapy via administration of insulin and administration of drugs that increase insulin secretion
(sulfonylureas) and drugs that reduce insulin resistance (such as troglitazone).

Where a particular sequence polymorphism correlates with differential drug
effectiveness, diagnostic screening may be performed. Diagnostic methods have been
described in detail in a preceding section. The presence of a particular polymorphism is
detected, and used to develop an effective therapeutic strategy for the affected individual.

Detection of Ngn3 Associated Disorders 
Diagnosis of Ngn3-associated disorders is performed by protein, DNA or RNA
sequence and/or hybridization analysis of any convenient sample from a patient, e.g. biopsy
material, blood sample, scrapings from cheek, etc. A nucleic acid sample from a patient
having a disorder that may be associated with Ngn3 is analyzed for the presence of a
predisposing polymorphism in Ngn3. A typical patient genotype will have at least one
predisposing mutation on at least one chromosome. The presence of a polymorphic Ngn3
sequence that affects the activity or expression of the gene product, and confers an increased
susceptibility to an Ngn3-associated disorder (e.g., hyperglycemia, diabetes, and the like) is
considered a predisposing polymorphism. Individuals are screened by analyzing their DNA or
mRNA for the presence of a predisposing polymorphism, as compared to sequence from an
unaffected individual(s). Specific sequences of interest include, for example, any
polymorphism that is associated with a diabetic syndrome, especially with Type 2 diabetes, or
is otherwise associated with diabetes, including, but not limited to, insertions, substitutions
and deletions in the coding region sequence, intron sequences that affect splicing, or promoter
or enhancer sequences that affect the activity and expression of the protein.

Screening may also be based on the functional or antigenic characteristics of the
protein. Immunoassays designed to detect predisposing polymorphisms in Ngn3 proteins may
be used in screening. Where many diverse mutations lead to a particular disease phenotype,
functional protein assays can be effective screening tools.

Biochemical studies may be performed to determine whether a candidate sequence
polymorphism in the Ngn3 coding region or control regions is associated with disease. For
example, a change in the promoter or enhancer sequence that affects expression of Ngn3 may
result in predisposition to diabetes. Expression levels of a candidate variant allele are
compared to expression levels of the normal allele by various methods known in the art.
Methods for determining promoter or enhancer strength include quantitation of the expressed
natural protein; insertion of the variant control element into a vector with a reporter gene
such as β-galactosidase, luciferase, chloramphenicol acetyltransferase, etc. that provides for
convenient quantitation; and the like. The activity of the encoded Ngn3 protein may be
determined by comparison with the wild-type protein.

A number of methods are available for analyzing nucleic acids for the presence of a
specific sequence. Where large amounts of DNA are available, genomic DNA is used
directly. Alternatively, the region of interest is cloned into a suitable vector and grown in
sufficient quantity for analysis. Cells that express Ngn3 genes, such as pancreatic cells, may
be used as a source of mRNA, which may be assayed directly or reverse transcribed into
cDNA for analysis. The nucleic acid may be amplified by conventional techniques, such as
the polymerase chain reaction (PCR), to provide sufficient amounts for analysis. The use of
the polymerase chain reaction is described in Saiki, et al. 1985 Science 239:487; a review of
current techniques may be found in Sambrook, et al. Molecular Cloning: A Laboratory
Manual CSH Press 1989, pp. 14.2-14.33. Amplification may also be used to determine
whether a polymorphism is present, by using a primer that is specific for the polymorphism.
Alternatively, various methods are known in the art that utilize oligonucleotide ligation as a
means of detecting polymorphisms; for examples see Riley et al. 1990 Nucl. Acid Res.
18:2887-2890; and Delahunty et al. 1996 Am. J. Hum. Genet. 58:1239-1246.

A detectable label may be included in an amplification reaction. Suitable labels include
fluorochromes, e.g. fluorescein isothiocyanate (FITC), rhodamine, Texas Red, phycoerythrin,
allophycocyanin, 6-carboxyfluorescein (6-FAM), 2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein
(JOE), 6-carboxy-X-rhodamine (ROX), 6-carboxy-2',4',7',4,7-hexachlorofluorescein
(HEX), 5-carboxyfluorescein (5-FAM) or N,N,N',N'-tetramethyl-6-carboxyrhodamine
(TAMRA); radioactive labels, e.g. ³²P, ³⁵S, ³H; etc. The label may be a
two stage system, where the amplified DNA is conjugated to biotin, haptens, etc. having a
high affinity binding partner, e.g. avidin, specific antibodies, etc., where the binding partner is
conjugated to a detectable label. The label may be conjugated to one or both of the primers.
Alternatively, the pool of nucleotides used in the amplification is labeled, so as to incorporate
the label into the amplification product.

The sample nucleic acid, e.g. amplified or cloned fragment, is analyzed by one of a
number of methods known in the art. The nucleic acid may be sequenced by dideoxy or other
methods, and the sequence of bases compared to a neutral Ngn3 sequence (e.g., an
Ngn3 sequence from an unaffected individual). Hybridization with the variant sequence may
also be used to determine its presence, by Southern blots, dot blots, etc. The hybridization
pattern of a control and variant sequence to an array of oligonucleotide probes immobilized
on a solid support, as described in US 5,445,934, or in WO95/35505, may also be used as a
means of detecting the presence of variant sequences. Single strand conformational
polymorphism (SSCP) analysis, denaturing gradient gel electrophoresis (DGGE), mismatch
cleavage detection, and heteroduplex analysis in gel matrices are used to detect
conformational changes created by DNA sequence variation as alterations in electrophoretic
mobility. Alternatively, where a polymorphism creates or destroys a recognition site for a
restriction endonuclease (restriction fragment length polymorphism, RFLP), the sample is
digested with that endonuclease, and the products size fractionated to determine whether the
fragment was digested. Fractionation is performed by gel or capillary electrophoresis,
particularly acrylamide or agarose gels.
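
By way of illustration only, the RFLP comparison described above can be sketched in a few lines of Python. The helper name digest_fragment_lengths, the two short example sequences, and the choice of the EcoRI recognition site (GAATTC, cut after the first G) are assumptions made for this sketch; the sequences are not Ngn3 sequences, and only the given strand of a linear fragment is scanned.

```python
def digest_fragment_lengths(sequence, site, cut_offset):
    """Return fragment lengths from cutting a linear sequence at each site.

    cut_offset is the number of bases of the recognition site that lie to
    the left of the cut (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cut_positions + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical control and variant fragments (illustrative, not Ngn3).
control = "AAAAGAATTCTTTT"  # contains one GAATTC site
variant = "AAAAGAGTTCTTTT"  # a single-base change destroys the site

control_fragments = digest_fragment_lengths(control, "GAATTC", 1)
variant_fragments = digest_fragment_lengths(variant, "GAATTC", 1)
```

In this sketch the control yields two fragments and the variant remains uncut, which corresponds to the difference that size fractionation would reveal.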

The hybridization pattern of a control and variant sequence to an array of
oligonucleotide probes immobilized on a solid support, as described in US 5,445,934, or in
WO95/35505, may be used as a means of detecting the presence of variant sequences. In one
embodiment of the invention, an array of oligonucleotides is provided, where discrete
positions on the array are complementary to at least a portion of mRNA or genomic DNA of
the Ngn3 locus. Such an array may comprise a series of oligonucleotides, each of which can
specifically hybridize to a nucleic acid sequence, e.g., mRNA, cDNA, genomic DNA, etc.
from the Ngn3 locus. Usually such an array will include at least 2 different polymorphic
sequences, i.e. polymorphisms located at unique positions within the locus, usually at least
about 5, more usually at least about 10, and may include as many as 50 to 100 different
polymorphisms. The oligonucleotide sequence on the array will usually be at least about 12
nt in length, may be the length of the provided polymorphic sequences, or may extend into the
flanking regions to generate fragments of 100 to 200 nt in length. For examples of arrays, see
Hacia et al. 1996 Nature Genetics 14:441-447; Lockhart et al. 1996 Nature Biotechnol.
14:1675-1680; and De Risi et al. 1996 Nature Genetics 14:457-460.
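
As a hedged illustration of how a pair of array oligonucleotides might be designed around one polymorphic position, the following Python sketch builds a reference probe and a variant probe, each complementary to its target. The function names, the 13 nt default probe length, and the example sequence are assumptions for this sketch only; the example sequence is not an Ngn3 sequence, and real probe design would also weigh melting temperature and cross-hybridization.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def allele_probes(reference, position, variant_base, probe_len=13):
    """Return (reference_probe, variant_probe) spanning one polymorphic site.

    position is a 0-based index into reference; each probe is the reverse
    complement of a probe_len window that contains that position.
    """
    if not 0 <= position < len(reference):
        raise ValueError("position outside the reference sequence")
    if not 12 <= probe_len <= len(reference):
        raise ValueError("probe_len must be >= 12 nt and fit the reference")
    half = probe_len // 2
    # Center the window on the site, clamped to the sequence ends.
    start = min(max(0, position - half), len(reference) - probe_len)
    end = start + probe_len
    variant = reference[:position] + variant_base + reference[position + 1:]
    return (reverse_complement(reference[start:end]),
            reverse_complement(variant[start:end]))


# Hypothetical 20 nt target with a G-to-T substitution at index 10.
example_reference = "ACGTACGTACGTACGTACGT"
ref_probe, var_probe = allele_probes(example_reference, 10, "T")
```

The two probes differ at a single base, so differential hybridization to them distinguishes the two alleles in principle.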

Antibodies specific for Ngn3 polymorphisms may be used in screening immunoassays.
A reduction or increase in Ngn3 and/or presence of an Ngn3 disorder-associated
polymorphism is indicative that the suspected disorder is Ngn3-associated. A sample is taken
from a patient suspected of having an Ngn3-associated disorder. Samples, as used herein,
include tissue biopsies, biological fluids, organ or tissue culture derived fluids, and fluids
extracted from physiological tissues, as well as derivatives and fractions of such fluids. The
number of cells in a sample will generally be at least about 10³, usually at least 10⁴, more
usually at least about 10⁵. The cells may be dissociated, in the case of solid tissues, or tissue
sections may be analyzed. Alternatively a lysate of the cells may be prepared.

Diagnosis may be performed by a number of methods. The different methods all
determine the absence or presence or altered amounts of normal or abnormal Ngn3 in patient
cells suspected of having a predisposing polymorphism in Ngn3. For example, detection may
utilize staining of cells or histological sections, performed in accordance with conventional
methods. The antibodies of interest are added to the cell sample, and incubated for a period
of time sufficient to allow binding to the epitope, usually at least about 10 minutes. The
antibody may be labeled with radioisotopes, enzymes, fluorescers, chemiluminescers, or other
labels for direct detection. Alternatively, a second stage antibody or reagent is used to
amplify the signal. Such reagents are well known in the art. For example, the primary
antibody may be conjugated to biotin, with horseradish peroxidase-conjugated avidin added
as a second stage reagent. Final detection uses a substrate that undergoes a color change in
the presence of the peroxidase. The absence or presence of antibody binding may be
determined by various methods, including flow cytometry of dissociated cells, microscopy,
radiography, scintillation counting, etc.

An alternative method for diagnosis depends on the in vitro detection of binding
between antibodies and Ngn3 in a lysate. Measuring the concentration of Ngn3 binding in a
sample or fraction thereof may be accomplished by a variety of specific assays. A
conventional sandwich type assay may be used. For example, a sandwich assay may first
attach Ngn3-specific antibodies to an insoluble surface or support. The particular manner of
binding is not crucial so long as it is compatible with the reagents and overall methods of the
invention. The antibodies may be bound to the plates covalently or non-covalently, preferably
non-covalently.

The insoluble supports may be any compositions to which polypeptides can be bound,
that are readily separated from soluble material, and that are otherwise compatible with the
overall method. The surface of such supports may be solid or porous and of any convenient
shape. Examples of suitable insoluble supports to which the receptor is bound include beads,
e.g. magnetic beads, membranes and microtiter plates. These are typically made of glass,
plastic (e.g. polystyrene), polysaccharides, nylon or nitrocellulose. Microtiter plates are
especially convenient because a large number of assays can be carried out simultaneously,
using small amounts of reagents and samples.

Patient sample lysates are then added to separately assayable supports (for example,
separate wells of a microtiter plate) containing antibodies. Preferably, a series of standards,
containing known concentrations of normal and/or abnormal Ngn3, is assayed in parallel with
the samples or aliquots thereof to serve as controls. Preferably, each sample and standard will
be added to multiple wells so that mean values can be obtained for each. The incubation time
should be sufficient for binding; generally, from about 0.1 to 3 hr is sufficient. After
incubation, the insoluble support is generally washed of non-bound components. Generally, a
dilute non-ionic detergent medium at an appropriate pH, generally 7-8, is used as a wash
medium. From one to six washes may be employed, with sufficient volume to thoroughly
wash non-specifically bound proteins present in the sample.

After washing, a solution containing a second antibody is applied. The antibody will
bind Ngn3 with sufficient specificity such that it can be distinguished from other components
present. The second antibodies may be labeled to facilitate direct or indirect quantification of
binding. Examples of labels that permit direct measurement of second receptor binding
include radiolabels, fluorescers, dyes, beads, chemiluminescers, colloidal
particles, and the like. Examples of labels which permit indirect measurement of binding
include enzymes where the substrate may provide for a colored or fluorescent product. In a
preferred embodiment, the antibodies are labeled with a covalently bound enzyme capable of
providing a detectable product signal after addition of suitable substrate. Examples of
suitable enzymes for use in conjugates include horseradish peroxidase, alkaline phosphatase,
malate dehydrogenase and the like. Where not commercially available, such antibody-enzyme
conjugates are readily produced by techniques known to those skilled in the art. The
incubation time should be sufficient for the labeled ligand to bind available molecules.
Generally, from about 0.1 to 3 hr is sufficient, usually 1 hr sufficing.

After the second binding step, the insoluble support is again washed free of
non-specifically bound material. The signal produced by the bound conjugate is detected by
conventional means. Where an enzyme conjugate is used, an appropriate enzyme substrate is
provided so a detectable product is formed.
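
The use of standards of known concentration, assayed in replicate wells alongside the samples, can be illustrated with a short Python sketch. It averages the replicate signals for each standard and estimates a sample concentration by linear interpolation between the two bracketing standards. The function names, the signal values, and the choice of piecewise-linear interpolation are assumptions for illustration; a monotonically increasing signal is assumed, and other curve-fitting approaches may equally be used.

```python
from statistics import mean


def standard_curve(standards):
    """Build a curve from {known_concentration: [replicate signals]}.

    Returns (mean_signal, concentration) pairs sorted by mean signal.
    """
    return sorted((mean(signals), conc) for conc, signals in standards.items())


def estimate_concentration(curve, sample_signals):
    """Interpolate the concentration for the mean of the sample replicates."""
    signal = mean(sample_signals)
    for (s0, c0), (s1, c1) in zip(curve, curve[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:  # flat segment: no slope to interpolate along
                return c0
            return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)
    raise ValueError("sample signal lies outside the range of the standards")


# Hypothetical plate readings (arbitrary signal units, arbitrary conc. units).
example_standards = {0: [0.1, 0.1], 10: [0.5, 0.7], 20: [1.1, 1.1]}
example_curve = standard_curve(example_standards)
example_estimate = estimate_concentration(example_curve, [0.85, 0.85])
```

Signals outside the range of the standards raise an error in this sketch rather than being extrapolated, reflecting the role of the standards as controls.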

Other immunoassays are known in the art and may find use as diagnostics.
Ouchterlony plates provide a simple determination of antibody binding. Western blots may be
performed on protein gels or protein spots on filters, using a detection system specific for
Ngn3 as desired, conveniently using a labeling method as described for the sandwich assay.

Other diagnostic assays of interest are based on the functional properties of Ngn3
proteins. Such assays are particularly useful where a large number of different sequence
changes lead to a common phenotype. For example, a functional assay may be based on the
transcriptional changes mediated by Ngn3 gene products. Other assays may, for example,
detect conformational changes, size changes resulting from insertions, deletions or
truncations, or changes in the subcellular localization of Ngn3 proteins.

In a protein truncation test, PCR fragments amplified from the Ngn3 gene or its
transcript are used as templates for in vitro transcription/translation reactions to generate
protein products. Separation by gel electrophoresis is performed to determine whether the
polymorphic gene encodes a truncated protein, where truncations may be associated with a
loss of function.
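
The laboratory test above detects truncation at the protein level. Purely as an in silico analogue, and not as part of the described method, the following Python sketch translates a coding sequence with the standard genetic code and stops at the first stop codon, so that a premature stop shows up as a shorter product. The function name and the two short example sequences are hypothetical and are not Ngn3 sequences.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_until_stop(coding_sequence):
    """Translate from the first base, stopping at the first stop codon."""
    coding_sequence = coding_sequence.upper()
    protein = []
    for i in range(0, len(coding_sequence) - 2, 3):
        amino = CODON_TABLE[coding_sequence[i:i + 3]]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


# Hypothetical reading frames: the variant changes TGG (Trp) to TGA (stop).
normal_orf = "ATGGCTTGGAAATAA"
variant_orf = "ATGGCTTGAAAATAA"

normal_product = translate_until_stop(normal_orf)
variant_product = translate_until_stop(variant_orf)
is_truncated = len(variant_product) < len(normal_product)
```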

Diagnostic screening may also be performed for polymorphisms that are genetically
linked to a predisposition for diabetes, particularly through the use of microsatellite markers
or single nucleotide polymorphisms. Frequently the microsatellite polymorphism itself is not
phenotypically expressed, but is linked to sequences that result in a disease predisposition.
However, in some cases the microsatellite sequence itself may affect gene expression.
Microsatellite linkage analysis may be performed alone, or in combination with direct
detection of polymorphisms, as described above. The use of microsatellite markers for
genotyping is well documented. For examples, see Mansfield et al. 1994 Genomics 24:225-233;
Ziegle et al. 1992 Genomics 14:1026-1031; Dib et al., supra.

Microsatellite loci that are useful in the subject methods have the general formula:

U (R)n U', where

U and U' are non-repetitive flanking sequences that uniquely identify the particular locus, R is
a repeat motif, and n is the number of repeats. The repeat motif is at least 2 nucleotides in
length, up to 7, usually 2-4 nucleotides in length. Repeats can be simple or complex. The
flanking sequences U and U' uniquely identify the microsatellite locus within the human
genome. U and U' are at least about 18 nucleotides in length, and may extend several
hundred bases up to about 1 kb on either side of the repeat. Within U and U', sequences are
selected for amplification primers. The exact composition of the primer sequences is not
critical to the invention, but they must hybridize to the flanking sequences U and U',
respectively, under stringent conditions. Criteria for selection of amplification primers are as
previously discussed. To maximize the resolution of size differences at the locus, it is
preferable to choose a primer sequence that is close to the repeat sequence, such that the total
amplification product is between 100 and 500 nucleotides in length.

The number of repeats at a specific locus, n, is polymorphic in a population, thereby
generating individual differences in the length of DNA that lies between the amplification
primers. The number will vary from at least 1 repeat to as many as about 100 repeats or
more.

The primers are used to amplify the region of genomic DNA that contains the repeats.
Conveniently, a detectable label will be included in the amplification reaction, as previously
described. Multiplex amplification may be performed in which several sets of primers are
combined in the same reaction tube. This is particularly advantageous when limited amounts
of sample DNA are available for analysis. Conveniently, each of the sets of primers is labeled
with a different fluorochrome.

After amplification, the products are size fractionated. Fractionation may be
performed by gel electrophoresis, particularly denaturing acrylamide or agarose gels. A
convenient system uses denaturing polyacrylamide gels in combination with an automated
DNA sequencer, see Hunkapiller et al. 1991 Science 254:59-74. The automated sequencer is
particularly useful with multiplex amplification or pooled products of separate PCR reactions.
Capillary electrophoresis may also be used for fractionation. A review of capillary
electrophoresis may be found in Landers, et al. 1993 BioTechniques 14:98-111. The size of
the amplification product is proportional to the number of repeats (n) that are present at the
locus specified by the primers. The size will be polymorphic in the population, and is
therefore an allelic marker for that locus.
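
The relationship between product size and repeat number can be made concrete with a small worked sketch in Python. For a locus of the form U (R)n U', the amplified length is the amplified portion of each flank plus n times the motif length, so the size grows linearly with n on top of a constant flank contribution. The function names and the numerical values (flank lengths of 60 and 55 nt, a CA dinucleotide motif) are illustrative assumptions only.

```python
def amplicon_length(left_flank_len, right_flank_len, motif, n_repeats):
    """Length of the amplified product for a simple repeat (R)n.

    The flank lengths are the amplified portions of U and U', measured
    from the outer end of each primer to the repeat tract.
    """
    return left_flank_len + n_repeats * len(motif) + right_flank_len


def repeats_from_length(product_len, left_flank_len, right_flank_len, motif):
    """Infer n from a measured product length, assuming a simple repeat."""
    repeat_tract = product_len - left_flank_len - right_flank_len
    n_repeats, remainder = divmod(repeat_tract, len(motif))
    if repeat_tract < 0 or remainder != 0:
        raise ValueError("length is not consistent with a whole number of repeats")
    return n_repeats


# Hypothetical locus: 60 nt and 55 nt of amplified flank, (CA)n repeat.
example_size = amplicon_length(60, 55, "CA", 20)
example_n = repeats_from_length(example_size, 60, 55, "CA")
```

Two alleles that differ by one CA repeat differ by 2 nt in this example, which indicates the resolution required of the fractionation step.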

Therapeutic Uses of Ngn3-Encoding Nucleic Acid

Ngn3-encoding nucleic acid can be introduced into a cell to accomplish
transformation of the cell, preferably stable transformation, and the transformed cell
subsequently implanted into a subject having a disorder characterized by a deficiency in
insulin (e.g., an Ngn3-associated disorder), depending upon the tissue into which the
transformed cell is implanted. Preferably, the host cell to be transformed and implanted in the
subject is derived from the individual who will receive the transplant (e.g., to provide an
autologous transplant). Where the transformed cells are to be inserted into an individual (e.g.,
into the pancreas, liver, abdominal cavity, etc.), the cells into which the nucleic acid is
introduced are preferably stem cells capable of developing into β-cells within the pancreatic
tissue environment, e.g., stem cells derived from pancreatic tissue, gastrointestinal tissue, or
cells capable of expression of insulin upon expression of the Ngn3-encoding nucleic acid.

For example, in a subject having Type 1 diabetes, gastrointestinal stem cells can be
isolated from the affected subject, the cells transformed with Ngn3-encoding DNA, and the
transformed cells implanted in the affected subject to provide for insulin production, or the
transformed cells cultured so as to facilitate development of the cells into insulin-producing
β-cells.

Introduction of the Ngn3-encoding nucleic acid into the cell can be accomplished
according to methods well known in the art (e.g., through use of electroporation,
microinjection, lipofection, infection with a recombinant (preferably replication-deficient)
virus, and other means well known in the art). Preferably, the Ngn3-encoding nucleic acid is
operably linked to a promoter that facilitates a desired level of Ngn3 polypeptide expression
(e.g., a promoter derived from CMV, SV40, adenovirus, or a tissue-specific or cell
type-specific promoter). Transformed cells containing the Ngn3-encoding nucleic acid can be
selected and/or enriched via, for example, expression of a selectable marker gene present in
the Ngn3-encoding construct or that is present on a plasmid that is co-transfected with the
Ngn3-encoding construct. Typically selectable markers provide for resistance to antibiotics
such as tetracycline, hygromycin, neomycin, and the like. Other markers can include
thymidine kinase and the like.

The ability of the transformed cells to express the Ngn3-encoding nucleic acid can be
assessed by various methods known in the art. For example, Ngn3 expression can be
examined by Northern blot to detect mRNA which hybridizes with a DNA probe derived
from the relevant gene. Those cells that express the desired gene can be further isolated and
expanded in in vitro culture using methods well known in the art. The host cells selected for
transformation with Ngn3-encoding DNA will vary with the purpose of the ex vivo therapy
(e.g., insulin production), the site of implantation of the cells, and other factors
that will be appreciated by the ordinarily skilled artisan.

Methods for engineering a host cell for expression of a desired gene product(s) and
implantation or transplantation of the engineered cells (e.g., ex vivo therapy) are known in the
art (see, e.g., Gilbert et al. 1993 "Cell transplantation of genetically altered cells on
biodegradable polymer scaffolds in syngeneic rats," Transplantation 56:423-427). For
expression of a desired gene in exogenous or autologous cells and implantation of the cells
(e.g., islet cells) into pancreas, see, e.g., Docherty 1997 "Gene therapy for diabetes mellitus,"
Clin Sci (Colch) 92:321-330; Hegre et al. 1976 "Transplantation of islet tissue in the rat,"
Acta Endocrinol Suppl (Copenh) 205:257-281; Sandler et al. 1997 "Assessment of insulin
secretion in vitro from microencapsulated fetal porcine islet-like cell clusters and rat, mouse,
and human pancreatic islets," Transplantation 63:1712-1718; Calafiore 1997 "Perspectives in
pancreatic and islet cell transplantation for the therapy of IDDM," Diabetes Care 20:889-896;
Kenyon et al. 1996 "Islet cell transplantation: beyond the paradigms," Diabetes Metab Rev
12:361-372; Sandler; Chick et al. 1977 "Artificial pancreas using living beta cells:
effects on glucose homeostasis in diabetic rats," Science 197:780-782.

After expansion of the transformed cells in vitro, the cells are implanted into the
mammalian subject, preferably into the tissue from which the cells were originally derived, by
methods well known in the art. The number of cells implanted is a number of cells sufficient
to provide for expression of levels of Ngn3 sufficient to provide for enhanced levels of
insulin. The number of cells to be transplanted can be determined based upon such factors as the
levels of polypeptide expression achieved in vitro, and/or the number of cells that survive
implantation. Preferably the cells are implanted in an area of dense vascularization, and in a
manner that minimizes evidence of surgery in the subject. The engraftment of the implant of
transformed cells is monitored by examining the mammalian subject for classic signs of graft
rejection, i.e., inflammation and/or exfoliation at the site of implantation, and fever.

Alternatively, Ngn3-encoding nucleic acid can be delivered directly to an affected
subject to provide for Ngn3 expression in a target cell (e.g., a pancreatic cell, gut cell, liver
cell, or other organ cell capable of expressing Ngn3 and providing production of insulin),
thereby promoting development of the cell into an insulin-producing cell (e.g., in pancreas) or
curing a defect in Ngn3 expression in the subject. Methods for in vivo delivery of a nucleic
acid of interest for expression in a target cell are known in the art. For example, in vivo
methods of gene delivery normally employ either a biological means of introducing the DNA
into the target cells (e.g., a virus containing the DNA of interest) or a mechanical means to
introduce the DNA into the target cells (e.g., direct injection of DNA into the cells, liposome
fusion, pneumatic injection using a "gene gun," or introduction of the DNA via a duct of the
pancreas). For other methods of introduction of a DNA of interest into a cell in vivo, also see
Bartlett et al. 1997 "Use of biolistic particle accelerator to introduce genes into isolated islets
of Langerhans," Transplant Proc 29:2201-2202; Furth 1997 "Gene transfer by biolistic
process," Mol Biotechnol 7:139-143; Gainer et al. 1996 "Successful biolistic transformation
of mouse pancreatic islets while preserving cellular function," Transplantation 61:1567-1571;
Docherty 1997 "Gene therapy for diabetes mellitus," Clin Sci (Colch) 92:321-330; Maeda et
al. 1994 "Adenovirus-mediated transfer of human lipase complementary DNA to the
gallbladder," Gastroenterology 106:1638-1644.

The amount of DNA and/or the number of infectious viral particles effective to infect
the targeted tissue, transform a sufficient number of cells, and provide for production of a
desired level of insulin can be readily determined based upon such factors as the efficiency of
the transformation in vitro and the susceptibility of the targeted secretory gland cells to
transformation. For example, the amount of DNA injected into the pancreas of a human is
generally from about 1 μg to 750 mg, preferably from about 500 μg to 500 mg,
more preferably from about 10 mg to 200 mg, most preferably about 100 mg. Generally, the
amounts of DNA can be extrapolated from the amounts of DNA effective for delivery and
expression of the desired gene in an animal model. For example, the amount of DNA for
delivery in a human is roughly 100 times the amount of DNA effective in a rat.

Regardless of whether the Ngn3-encoding DNA is introduced in vivo or ex vivo, the
DNA (or cells expressing the DNA) can be administered in combination with other genes and
other agents. In addition, Ngn3-encoding DNA (or recombinant cells expressing Ngn3 DNA)
can be used therapeutically for disorders associated with, for example, a decrease in insulin
production, but which are not associated with an alteration in Ngn3 function per se. For
example, an increase in Ngn3 may cause an increase in the number of mature β-cells, and thus
an increase in insulin production, in an individual that has decreased insulin production from
some other cause not related to function of Ngn3.

Identification of Islet Cell Precursors and Development of β-Cells Using Ngn3

As described in more detail in the Examples below, the temporal and spatial pattern of
Ngn3 expression indicates that Ngn3 can be used as a marker for islet cell precursors. This
feature of Ngn3 expression can be exploited to provide compositions and methods to identify
and isolate islet cell precursors. For example, pancreatic tissue can be obtained from a
subject, and a single cell suspension obtained from the tissue. The single cell cultures can
then be expanded in culture, and representative cells from the single cell cultures analyzed for
Ngn3 expression. Ngn3 expression can be analyzed by, for example, detection of
Ngn3-encoding mRNA (e.g., by PCR amplification using a probe derived from an Ngn3-encoding
sequence) or by detection of the Ngn3 polypeptide in cell lysates using an anti-Ngn3
antibody. Cells that express Ngn3 are identified as being islet cell precursors. The cells of the
corresponding culture could then be expanded and/or used to derive mature β-cells in culture,
and the mature β-cells implanted into the subject, e.g., either into the same subject from
whom the cells were initially obtained or into a different subject.

Ngn3 is also useful for monitoring development of islet cell precursors into mature
β-cells. In short, Ngn3 expression can be monitored in an in vitro culture to determine when
the cells become mature β-cells. For example, cells that express Ngn3 are at an earlier stage
of β-cell development. Once Ngn3 expression decreases or becomes substantially
undetectable, the cell can be identified as having developed into a mature β-cell. The cells can
be screened for other markers of islet cell development, as well as for insulin production.



EXAMPLES

The following examples are put forth so as to provide those of ordinary skill in the art
with a complete disclosure and description of how to carry out the invention and are not
intended to limit the scope of what the inventors regard as their invention. Efforts have been
made to ensure accuracy with respect to numbers used (e.g., amounts, temperatures, etc.),
but some experimental error and deviation should be accounted for. Unless indicated
otherwise, parts are parts by weight, molecular weight is weight average molecular weight,
temperature is in degrees Centigrade, and pressure is at or near atmospheric.

Example 1: Detection of Ngn3 Expression in Murine Pancreas

Members of the basic helix-loop-helix (bHLH) family of transcription factors regulate
growth and differentiation of numerous cell types. Insulin gene expression is activated by a
heterodimeric complex of two bHLH proteins: a ubiquitously expressed (class A) protein and
a cell-type-specific (class B) partner, BETA2/neuroD1. BETA2/neuroD1 is also important
for β-cell development. The targeted disruption of the BETA2/neuroD1 gene in mice leads to
a marked reduction of the β-cell mass at birth due to increased apoptosis of islet cells late in
fetal development. There is no apparent defect, however, in β-cell formation or insulin gene
expression, despite the postulated importance of this factor in β-cell differentiation.

Assuming that this modest phenotype reflected the redundant expression of closely
related class B bHLH proteins in the endocrine pancreas, the inventors searched for additional
members of the family by reverse transcriptase-polymerase chain reaction (RT-PCR) using
degenerate oligonucleotide primers based on conserved amino acid sequences in the bHLH
domain of the class B bHLH proteins (Sommer et al. 1996 Mol. Cell. Neurosci. 8:221). PCR
analysis revealed that pancreatic endocrine cell lines and isolated adult islets express not only
neuroD1 but also several other members of the family of neural class B bHLH genes,
including mash1, neuroD2 and 4, and neurogenins (ngn) 1, 2 and 3. This remarkable degree
of redundancy could compensate for the loss of BETA2/neuroD1 in mice. The two most
commonly amplified sequences encoded neuroD4 and Ngn3, but in situ hybridization studies
in mouse pancreas showed highest expression of neuroD1 and Ngn3. These results were
confirmed by immunohistochemistry.
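The degenerate-primer strategy mentioned above can be sketched as a back-translation of a conserved peptide into an IUPAC-coded oligonucleotide. The example below is a minimal illustration under stated assumptions: the motif NRMHNLN is taken from the bHLH region of the Ngn3 translation shown in the sequence listing purely as a convenient example, and it is not claimed to be the primer site used by Sommer et al. The function names are hypothetical.

```python
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each one-letter amino acid to the list of codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))

# IUPAC ambiguity codes, keyed by the sorted set of bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide)


def degenerate_primer(peptide: str) -> str:
    """Per-position IUPAC consensus of all codons for each residue.

    For six-codon residues (L, R, S) the consensus also matches a few
    codons for other amino acids, so it is a superset of the true pool.
    """
    primer = []
    for aa in peptide:
        for position in range(3):
            bases = sorted({codon[position] for codon in CODONS_FOR[aa]})
            primer.append(IUPAC["".join(bases)])
    return "".join(primer)


if __name__ == "__main__":
    motif = "NRMHNLN"  # illustrative stretch only, see the note above
    print(degenerate_primer(motif), degeneracy(motif))
```

In practice, primer sites are usually chosen in stretches with few six-codon residues, since each Leu, Arg, or Ser multiplies the size of the oligonucleotide pool by six.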

Ngn3 is detected earliest at embryonic day 11.5 (e11.5) in the mouse, increases to a
maximum at e15.5 and decreases at e18.5, with no staining seen in the adult pancreas. Ngn3
is detected in the nuclei of scattered ductal cells and periductal cells, and there was no
co-staining with any of the four islet hormones (insulin, glucagon, somatostatin and
pancreatic polypeptide). This temporal and spatial pattern of expression implicated Ngn3 as a
marker for islet cell precursors. Nkx6.1, a specific marker for future beta-cells, was
expressed in 10-20% of the Ngn3-positive cells, further supporting the use of Ngn3 as a
marker for islet cell precursors. The peak of Ngn3 expression at e15.5 also corresponds with
the peak of new beta-cell formation in the fetus. Our data support a model in which Ngn3
acts upstream of BETA2/neuroD1 and other islet differentiation factors, marking islet cell
precursors, but switching off prior to final differentiation.

Example 2: Isolation and Sequencing of a Human Ngn3 Polypeptide-Encoding
Polynucleotide

A probe derived from a cloned fragment of the murine Ngn3 gene (Sommer et al.,
supra) was used to screen a human genomic library. This screen resulted in the isolation of
the genomic sequence provided as SEQ ID NO:1 in the sequence listing. Based on mapping
of the murine start site using 5' RACE of mouse fetal pancreatic RNA, the transcriptional
start site in the human Ngn3-encoding sequence is at nucleotide residue 2643. The coding
sequence is between nucleotide residues 3022-3663, with a stop site at 3664-3666. No
introns are within the 5' untranslated region (UTR) or the coding sequence of SEQ ID NO:1.
The promoter of Ngn3 is of interest, particularly given that it is exceptionally
well-conserved between mouse, rat, and human. Given the role of Ngn3 in pancreatic and islet cell
development, the Ngn3 promoter is likely key to determining the number of islet cells in the
mature pancreas. The regulatory region corresponding to the human Ngn3 promoter
comprises sequences up to approximately 500 bp upstream of the transcription start site
within the human Ngn3 promoter (e.g., from about 2144 to the transcriptional start site at
2643).
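The coordinates given above for the human genomic sequence can be checked against one another with a short sketch. The `Region` class and the constant names are hypothetical and introduced only for illustration; the numbers are those stated in the text, treated as 1-based inclusive positions on SEQ ID NO:1. The 5' UTR is assumed here to run from the transcriptional start site to the base before the coding sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A 1-based, inclusive interval on a genomic sequence."""
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1

    def extract(self, sequence: str) -> str:
        # Convert 1-based inclusive coordinates to a Python slice.
        return sequence[self.start - 1:self.end]


# Coordinates as stated in the text for the human sequence (SEQ ID NO:1).
PROMOTER = Region("promoter (approx.)", 2144, 2643)
FIVE_PRIME_UTR = Region("5' UTR", 2643, 3021)  # assumed: TSS up to the base before the CDS
CODING = Region("coding sequence", 3022, 3663)
STOP = Region("stop codon", 3664, 3666)

if __name__ == "__main__":
    for region in (PROMOTER, FIVE_PRIME_UTR, CODING, STOP):
        print(f"{region.name}: {region.start}-{region.end} ({region.length} nt)")
    # A coding sequence of 642 nt corresponds to 214 codons.
    print("codons:", CODING.length // 3)
```

The coding interval is a multiple of three and gives 214 codons, which agrees with the 214-residue translation shown in the sequence listing.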

FISH was used to map Ngn3 to human chromosome 10q22.1-22.2.

Example 3: Isolation and Sequencing of a Murine Ngn3 Polypeptide-Encoding
Polynucleotide and Promoter
The full-length murine Ngn3 sequence and its 5' flanking sequences, which included
the murine Ngn3 promoter, were obtained by sequencing a previously obtained mouse
genomic DNA fragment (Sommer et al., supra). The murine Ngn3 sequence is provided in
the Sequence Listing as SEQ ID NO:3, with the encoded polypeptide provided as SEQ ID
NO:4. The transcriptional start site was determined using the 5' RACE method and
confirmed using RNase protection with RNA from fetal mouse pancreas, and is at nucleotide
residue 719; the coding sequence for murine Ngn3 begins at nucleotide residue 1093. The
promoter comprises a region approximately 500 bp upstream of the transcription start site.
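As a consistency check on the murine coordinates, the sketch below derives the protein length implied by the CDS feature of SEQ ID NO:3 and the approximate promoter window. The function and variable names are hypothetical; the CDS range (1093 to 1737) is taken from the sequence listing and is assumed to include the stop codon, and the 500 bp window is the approximate figure given in the text.

```python
def protein_length_from_cds(cds_start: int, cds_end: int,
                            includes_stop: bool = True) -> int:
    """Residue count implied by a 1-based, inclusive CDS interval."""
    length = cds_end - cds_start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    codons = length // 3
    # Drop the terminal stop codon when the feature range includes it.
    return codons - 1 if includes_stop else codons


MURINE_TSS = 719            # transcriptional start site, from the text
MURINE_CDS = (1093, 1737)   # CDS feature from the sequence listing (assumed to include stop)

# Distance from the start site to the base before the first coding nucleotide.
utr_length = MURINE_CDS[0] - MURINE_TSS
# Approximate promoter window: the 500 bp immediately upstream of the start site.
promoter_window = (MURINE_TSS - 500, MURINE_TSS - 1)

if __name__ == "__main__":
    print("residues:", protein_length_from_cds(*MURINE_CDS))
    print("5' UTR length:", utr_length)
    print("promoter window:", promoter_window)
```

The implied length of 214 residues matches the length recorded for SEQ ID NO:4 in the sequence listing.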

The invention now being fully described, it will be apparent to one of ordinary skill in
the art that many changes and modifications can be made thereto without departing from the
spirit or scope of the appended claims.

CLAIMS

What is claimed is:

1. An isolated human neurogenin3 (Ngn3) polypeptide.

2. The human Ngn3 polypeptide of claim 1, comprising an amino acid sequence of
SEQ ID NO:2.

3. The human Ngn3 polypeptide of claim 1 comprising an amino acid sequence
having at least about 70% amino acid sequence identity with the amino acid sequence of SEQ
ID NO:2.

4. An isolated polynucleotide sequence or complement thereof comprising a
polynucleotide sequence encoding a human Ngn3 polypeptide of claim 1.

5. The isolated polynucleotide of claim 4, wherein the Ngn3 polypeptide has an
amino acid sequence that is substantially identical to the amino acid sequence of SEQ ID
NO:2.

6. The isolated polynucleotide sequence of claim 4 comprising a polynucleotide
sequence of nucleotides 3022-3063 of SEQ ID NO:1.

7. An isolated polynucleotide sequence that hybridizes under stringent conditions to
the polynucleotide sequence of nucleotides 3022-3063 of SEQ ID NO:1.

8. A recombinant expression vector comprising the polynucleotide sequence of claim 4.

9. An isolated recombinant host cell comprising a polynucleotide sequence encoding
the polypeptide of claim 1.



WO 00/59936    PCT/US00/08436



10. A method for producing the human Ngn3 polypeptide of claim 1, the method
comprising the steps of:

a) culturing a recombinant host cell containing a human Ngn3 polypeptide-encoding
polynucleotide sequence under conditions suitable for the expression of the
polypeptide; and

b) recovering the polypeptide from the host cell culture.

11. An isolated antibody that specifically binds a human Ngn3 polypeptide of claim 1.

12. A method for identifying a polynucleotide homologous to the polynucleotide of
claim 4, the method comprising the steps of:

contacting a polynucleotide probe with a test polynucleotide, the probe comprising at
least 15 contiguous nucleotides of a polynucleotide sequence encoding a human Ngn3
polypeptide; and

detecting hybridization of the probe with the test polynucleotide;

wherein detection of hybridization of the probe to the test polynucleotide indicates
that the polynucleotide shares sequence homology with the human Ngn3 polypeptide-encoding
polynucleotide.



13. A method for identifying an islet cell precursor, the method comprising the step
of analyzing a cell for expression of a neurogenin3 (Ngn3) gene product, wherein detection
of the Ngn3 gene product is indicative of an islet cell precursor.

14. An isolated nucleic acid sequence comprising a neurogenin3 (Ngn3) promoter.

15. The isolated nucleic acid sequence of claim 14, wherein the Ngn3 promoter is a
human neurogenin3 promoter.

16. The isolated nucleic acid sequence of claim 14, wherein the sequence comprises a
nucleotide sequence of a region 5' of nucleotide residue 2643 of SEQ ID NO:1.

17. The isolated nucleic acid sequence of claim 14, wherein the Ngn3 promoter is a
murine neurogenin3 promoter.

18. The isolated nucleic acid sequence of claim 17, wherein the Ngn3 promoter
comprises a nucleotide sequence of a region 5' of nucleotide residue 719 of SEQ ID NO:3.

19. A method for identifying a biologically active agent that modulates human
neurogenin3 (Ngn3) activity, the method comprising:

combining a candidate agent with any one of
(a) a human Ngn3 polypeptide;

(b) a recombinant cell comprising a nucleic acid encoding a human Ngn3 polypeptide;
or

(c) a recombinant cell comprising a nucleic acid encoding a mammalian Ngn3
promoter sequence operably linked to a nucleic acid encoding a reporter gene; and

determining the effect of said agent on Ngn3 activity.

20. A method for detecting in a subject a predisposition to a defect in pancreatic islet
cell function or formation associated with a defect in neurogenin3 (Ngn3) activity, the
method comprising:

analyzing the genomic DNA or mRNA of an individual for the presence of at least one
predisposing alteration in a genomic Ngn3 sequence;

wherein the presence of the altered genomic Ngn3 sequence is indicative of an
increased susceptibility to a defect in pancreatic islet cell function or formation.

21. The method of claim 20, wherein the alteration is in an Ngn3 promoter sequence.

22. The method of claim 20, wherein the alteration is in a genomic sequence encoding
an Ngn3 polypeptide.

23. A method for producing a human pancreatic islet cell, the method comprising:

identifying a human pancreatic islet cell precursor by detection of expression of human
neurogenin3 (Ngn3); and

expanding the identified precursor cell in vitro;

wherein expansion of the identified cells produces a human pancreatic islet cell.

24. A pancreatic islet cell produced by the method of claim 23.

SEQUENCE LISTING



<110> German, Michael S.
Lin, Joseph 



<120> HUMAN NEUROGENIN 3-ENCODING NUCLEOTIDE 
SEQUENCES 



<130> UCSF-129WO 

<140> 60/128,180
<141> 1999-04-09 



<160> 4 



<170> FastSEQ for Windows Version 3.0



<210> 1 
<211> 5340 
<212> DNA 

<213> Homo sapiens 



<220> 
<221> CDS 

<222> (3022)...(3666)

<223> Coding sequence of human neurogenin3



<400> 1 

ggatccctcg tggccagggt tcccttcaag gtgcttagcc aggtcaggag gccctagaga 60 

agcatggttt ggattttctt tcccagacca aaaaagctcc aagttggttc tctcccagtt 120 

tctaacttgc agttaaataa atcaggcaag gctggcctat gaggcagaca agtgtgaaga 180 

aggagaagga ggaggagaag gagaaggaga aagaagaaga aggaggagaa gaagaagaag 240 

aagaagaaga agaagaggag gaggaggagg aggaggagga agcagcagca gcagcagcag 300 

cttgaatgga cagtggttcc ccttgcctag aaaatgggac cattatttct tttctaatct 360 

gacccccaga ctcaggactt cctctatttt ctgcattttg gggtctcttg ttttgccttg 420 

aaaaaaaatg trttctccca aatcaaggag cagtagctgg tgcaagggaa aatctagggc 480 

taggagtctt aagatatgac ttctatgtgg ttctgataga acttgctggg tgaccttgag 540 

agagtcactc cccctctctg ggccttgatt ttttcatctt taaagaaggc ctcaaattcc 600 

cattcttatg agaagaagac aagctcctag tgagtggtga cctaagggag cagctgcagc 660 

aaaatgctaa cctgacagtc ccagatggtc cctttattgg ttctgaccct ggtctcaggc 720 

ttcatttccc cacagcaagg gaaggagcct gcrcacagag caccagctaa gatcagcagg 780 

accgcgccac acccccgccc agtcctagag cccccctctc gctggttcct gagcatacca 840 

ccctcttcct tggaggaaaa tttgccccca agcagcctag gcggtaagag gctatcacta 900 

gggcagactc acagacctac ctcatcccct caccccaccc tacagtctcg aagtcgggtc 960 

ctgtcccctc ctgcagtttc cgggagactc aggatatctg gacctgctag aaagagaagc 1020 

cttcctcgcc taaggagact taaaccggga tacttaaacc tcccgcctcg gcgtcttcct 1080 

ccaggcacga ccgggtcaag agagagaagc ggaagctgca acccctcact ctgagtgacc 1140

ggaagcagaa gaccacggga tgtcccaggc ggggacaaga ggaggggctg gggaagaaag 1200 

gagggatgat gagttcagag tccctttgga aaggtttcca gagagcgcta ccagggacaa 1260 

cccaaggggc tggggaagtc cctgccttgt gctctctctg cgatgcccga gtgatgcaga 1320 

ggcagggggc tggagcaggt gactgctggc agctgctgtc tgtctgtgat tggaccggag 1380 

gactaagggg agaaaaagtt tatcagcttc tcccagtgcc tgcacgctgt ggtagttcaa 1440

aagacacgag ggggaggggc acagcagctc tgcttcccag cgccttggga gactgaagtg 1500 

aaaggaacgc trgagcccag gagttcgaga ccatcctggg caacaaagca agaccgcccc 1560 

tcaccccata caaaataaaa atacaaataa attagccggg cacagtggcg catgcctgta 1620 

gtctcagcta ctgggaaggc tgaagtggga ggatagcttg agcccaggag atcaaggctg 1680 

cagtgagctg tgattgcacc actgcagtcc agcctgggcg acagaaggag accgtttttt 1740 

ggttttgttt gztcgtttaa aaaaaaaaag aagcaagagc tcactgtgaa ctcctggttc 1800 

cttcctcccc tcctcacact tcccagaact cttcctgtca cggttcctgg ccagaacgct 1860 

gggatactat ctacaagctg tagtaggctr gtagtaatgg aatgtccgct tgaggggtcc 1920 

ccgcacagcc aaccccggcc tctggagtgg gatctatggg ggtggggttc taagcgcctc 1980 

tggggagtgt gaggtagcat ctcagggtgt ggcagaggcc cggacacccc caaaaggtct 2040

gtgaatggaa gggacatagg caggatctct ctcagtgatg tcccctgtct tccaggatga 2100

agagagccag tgaaacacca ggagagcagg gcgtccttta gaattcctgg acccttctcc 2160 

aggctgctag tcaggacaat gagctcgtgg ttgtctttgc cactatcttc ccgtgcgatt 2220 

tcagacaagc cacctccctc actaagccta aatttcccca tgtgtaacgt gcaggcattg 2280

taccctagag gcatcaaagt cccctccagg acagatgcta aggaaagaca ggctaggagc 2340 

aaagccgtct gaggtggcct gaccagagcc acacgaggct cttctcactg ggcgaggctc 2400

tttgaggaac cgagagttgc tgggacccag cccgccctcg agagagcaaa cagagcggcg 2460 

ctcccctccc ccgaccccgg ccctttgtcc ggaatccagc tgtgctgcgg gggaggagcg 2520 

ggctcgcgtg gcgcggcccc agggccccgg cgctgattgg ccggtggcgc gggcagcagc 2580 

cgggcaggca cgctcctggc ccgggcgaag cagaCaaagc gtgccaaggg gcacacgact 2640 

tgctgctcag gaaatccctg cggtctcacc gccgcgcctc gagagagagc gtgacagagg 2700 

cctcggaccc cattctctct tcttttctcc tttggggctg gggcaactcc caggcggggg 2760 

cgcctgcagc tcagctgaac ttggcgacca gaagcccgct gagctcccca cggccctcgc 2820 

tgctcatcgc tctctattct tttgcgccgg tagaaaggta atatttggag gcctccgagg 2880

gacgggcagg ggaaagaggg atcctctgac ccagcggggg ctgggaggat ggctgttttt 2940

gttttttccc acctagcctc ggaatcgcgg actgcgccgt gacggactca aacttaccct 3000 

tccctctgac cccgccgtag g atg acg cct caa ccc tcg ggt gcg ccc act 3051
Met Thr Pro Gln Pro Ser Gly Ala Pro Thr
1 5 10

gtc caa gtg acc cgt gag acg gag cgg tcc ttc ccc aga gcc tcg gaa 3099
Val Gln Val Thr Arg Glu Thr Glu Arg Ser Phe Pro Arg Ala Ser Glu
15 20 25

gac gaa gtg acc tgc ccc acg tcc gcc ccg ccc agc ccc act cgc aca 3147
Asp Glu Val Thr Cys Pro Thr Ser Ala Pro Pro Ser Pro Thr Arg Thr
30 35 40

cgg ggg aac tgc gca gag gcg gaa gag gga ggc tgc cga ggg gcc ccg 3195
Arg Gly Asn Cys Ala Glu Ala Glu Glu Gly Gly Cys Arg Gly Ala Pro
45 50 55

agg aag ctc cgg gca cgg cgc ggg gga cgc agc cgg cct aag agc gag 3243
Arg Lys Leu Arg Ala Arg Arg Gly Gly Arg Ser Arg Pro Lys Ser Glu
60 65 70

ttg gca ctg agc aag cag cga cgg agt cgg cga aag aag gcc aac gac 3291
Leu Ala Leu Ser Lys Gln Arg Arg Ser Arg Arg Lys Lys Ala Asn Asp
75 80 85 90

cgc gag cgc aat cga atg cac aac ctc aac tcg gca ctg gac gcc ctg 3339
Arg Glu Arg Asn Arg Met His Asn Leu Asn Ser Ala Leu Asp Ala Leu
95 100 105

cgc ggt gtc ctg ccc acc ttc cca gac gac gcg aag ctc acc aag atc 3387
Arg Gly Val Leu Pro Thr Phe Pro Asp Asp Ala Lys Leu Thr Lys Ile
110 115 120

gag acg ctg cgc ttc ccc cac aac tac atc tgg gcg ctg act caa acg 3435
Glu Thr Leu Arg Phe Ala His Asn Tyr Ile Trp Ala Leu Thr Gln Thr
125 130 135

ctg cgc ata gcg gac cac agc ttg tac gcg ctg gag ccg ccg gcg ccg 3483
Leu Arg Ile Ala Asp His Ser Leu Tyr Ala Leu Glu Pro Pro Ala Pro
140 145 150

cac tgc ggg gag ctg ggc agc cca ggc ggt tcc ccc ggg gac tgg ggg 3531
His Cys Gly Glu Leu Gly Ser Pro Gly Gly Ser Pro Gly Asp Trp Gly
155 160 165 170

tcc ctc tac tcc cca gtc tcc cag gct ggc agc ctg agt ccc gcc gcg 3579
Ser Leu Tyr Ser Pro Val Ser Gln Ala Gly Ser Leu Ser Pro Ala Ala
175 180 185

tcg ctg gag gag cga ccc ggg ctg ctg ggg gcc acc tct tcc gcc tgc 3627
Ser Leu Glu Glu Arg Pro Gly Leu Leu Gly Ala Thr Ser Ser Ala Cys
190 195 200

ttg agc cca ggc agt ctg gct ttc tca gat ttt ctg tga aaggacctgt 3676
Leu Ser Pro Gly Ser Leu Ala Phe Ser Asp Phe Leu *
205 210



ctgtcgctgg gctgtgggtg ctaagggtaa gggagaggga gggagccggg agccgtagag 3736 

ggtggccgac ggccgcggcc ctcaaaagca cttgttcctt ctgcttctcc ctggctgacc 3796 

cctggccggc ccaggctcca cgggggcggc aggctgggtt cattccccgg ccctccgagc 3856 

cgcgccaacg cacgcaaccc ttgctgctgc ccgcgcgaag tgggcattgc aaagtgcgct 3916 

cattttaggc ctcctctctg ccaccacccc ataatctcat tcaaagaata ctagaatggt 3976 

agcactaccc ggccggagcc gcccaccgtc ttgggtcgcc ctaccctcac tcaagtctgt 4036

ctgcctctca gtctcttacc acccctcctc caatgtgatt caatccaatg tttggtctct 4096 

cagcgcttac tccccttgcc ttgctccaaa gacgctgccg atctgctcta ctcccaatca 4156 

ggtccgggat ttcagggcgc ctcactctgc cttaaagcca cgaaggcgac cctctgcctt 4216 

ctcctcgtgc acttttcgga gccattgccc tcccggggcg gaagaccagg ctgtgaactg 4276 

ggaaagcgct agcccggcca gggagcatct ccccagcctc cctgcgaact gcgcctgaaa 4336 

cgtgagctgc gctgcaggtg cctggagcac cgcgcatctt ttttttttaa atctgtttgt 4396 

aaattatatg atgccttttg aaatcaattt tggtacagta aaattatatg gcccctcccc 4456 

tgttttacac atttgtattt attaatgaga tttcacagca gggaaaagcc tatattttgg 4516 

atattagatt atttagggat tgctggatga catttaagcc aataaaaaaa aatggacctt 4576 

caagaagcct tggcaagatg actccattgt gtgttgggga gaggagggcc acagtcacta 4636

cagctgagga agagcacttc tgtccaaaga gagggatgac acrctttctg gaggtctggg 4696 

ctagagccag ggcagattgg gtttggagag ctggaagtct tctaagtaat tattggtcca 4756 

gctccctttt ttctatatag ggcaatgact cctcttattt caaagagtgg tttagaagaa 4816

agacaagcct ccaactagga caactgactc tcacttgctg gccctttccc caactccacc 4876 

agcctagctt tagagcaact gttggttgca cttggggaag ggatacagta ataattcaat 4936 

tgcagagtca gagtcctcgg aaacacggct gggctgggca tcctaggaat tttcccaagg 4996

tgcttagagg cctagcaaat cccctgagca tattttactc cccaggcact gaggtggctg 5056 

tgtcgtgaac tccttgaact gagcagccag gagcaaagaa ggtggagcgt ctggctggaa 5116 

tatccagcaa cgccccctcc ctcatcacct ggcagccttg attgaaaact tattaagaaa 5176 

ctgttcaagg tttccagcca caccatgtct cttactggca aggtggaata ggactggtgc 5236 

agcatgagca ctgaaatctg tcccaggagt gccagtagag caccactaca tgacttcagg 5296

gacccctagg acctcagaga atatggtcta agctgtaagg atcc 5340

<210> 3
<211> 1861
<212> DNA
<213> M. musculus

<220>
<221> CDS

<222> (1093)...(1737)



<400> 3 

ggatcccaag gtgatattga acctggccaa gcaatagttt ctgagtagaa aggacttgag 60 

cagggaccgt ctctggtcac tctgtcctct ttcccaggat ggagtcagtc tgtgaaacat 120 

ggttgcacac acatttcctg acccaaccca tagtggcgga gagctggata gcactttgaa 180

ctaatgggcg ctcctcccag ctgccagcca agaagacact tgactccttg atcgctggtt 240 

catttagaca agccgCttcc ctctctgagc caaaagaccc catgtgtaat actcaaagaa 300 

gaggccttcc ttatatatat ataggcaccc ccaaacctcc ttcatgctac caagaaaggg 360 

tctggacaca tgccaaaaag aaagaggaaa aggcaaagct ctccccagcg gccggacggg 420 

actcttctgg ctgggcgagg ctctttgagg aaccgagagt tgctgggact gagcccgcga 480 

cgggggaggc gtggagtggg ggaacaaaca gagtgctgct cccctccccc gacccctgcc 540 

ctttgtccgg aatccagctg tgctctgcgg gtgggggttg tggggggagg agcgggctcg 600 

cgtggcgcag cccctgggcc ccctccgctg attggcccgt ggtgcaggca gcagcccggc 660 

aggcacgctc ctggccgggg gcagagcaga taaagcgtgc caggggacac acgacttgca 720 

tgcagctcag aaatccctct gggtctcatc actgcagcag tggtcgagta cctcctcgga 780 

gcttttctac gacttccaga cgcaatttac tccaggcgag ggcgcctgca gtttagcaga 840 

acttcagagg gagcagagag gctcagctat ccactgctgc ttgacactga ccctatccac 900 

tgctgcttgt cactgactga cctgctcctc tctattcttt tgagtcggga gaactaggta 960 

acaattcgga aactccaaag ggtggatgag gggcgcgcgg ggtgtgtgtg ggggatactc 1020 

tggtcccccg tgcagtgacc tctaagtcag aggctggcac acacacacct tccatttttt 1080 

cccaaccgca gg atg gcg cct cat ccc ttg gat gcg ctc acc atc caa gtg 1131
Met Ala Pro His Pro Leu Asp Ala Leu Thr Ile Gln Val
1 5 10

tcc cca gag aca caa caa cct ttt ccc gga ccc tcg gac cac gaa gtg 1179
Ser Pro Glu Thr Gln Gln Pro Phe Pro Gly Ala Ser Asp His Glu Val
15 20 25

ctc agt tcc aat tcc acc cca cct agc ccc act ctc ata cct agg gac 1227
Leu Ser Ser Asn Ser Thr Pro Pro Ser Pro Thr Leu Ile Pro Arg Asp
30 35 40 45

tgc tcc gaa gca gaa gtg ggt gac tgc cga ggg acc tcg agg aag ctc 1275
Cys Ser Glu Ala Glu Val Gly Asp Cys Arg Gly Thr Ser Arg Lys Leu
50 55 60

cgc gcc cga cgc gga ggg cgc aac agg ccc aag agc gag ttg gca ctc 1323
Arg Ala Arg Arg Gly Gly Arg Asn Arg Pro Lys Ser Glu Leu Ala Leu
65 70 75

agc aaa cag cga aga agc cgg cgc aag aag gcc aat gat cgg gag cgc 1371
Ser Lys Gln Arg Arg Ser Arg Arg Lys Lys Ala Asn Asp Arg Glu Arg
80 85 90

aat cgc atg cac aac ctc aac tcg gcg ctg gat gcg ctg cgc ggt gtc 1419
Asn Arg Met His Asn Leu Asn Ser Ala Leu Asp Ala Leu Arg Gly Val
95 100 105

ctg ccc acc ttc ccg gat gac gcc aaa ctt aca aag atc gag acc ctg 1467
Leu Pro Thr Phe Pro Asp Asp Ala Lys Leu Thr Lys Ile Glu Thr Leu
110 115 120 125

cgc ttc gcc cac aac tac atc tgg gca ctg act cag acg ctg cgc ata 1515
Arg Phe Ala His Asn Tyr Ile Trp Ala Leu Thr Gln Thr Leu Arg Ile
130 135 140

gcg gac cac agc ttc tat ggc ccg gag ccc cct gtg ccc tgt gga gag 1563
Ala Asp His Ser Phe Tyr Gly Pro Glu Pro Pro Val Pro Cys Gly Glu
145 150 155

ctg ggg agc ccc gga ggt ggc tcc aac ggg gac tgg ggc tct atc tac 1611
Leu Gly Ser Pro Gly Gly Gly Ser Asn Gly Asp Trp Gly Ser Ile Tyr
160 165 170

tcc cca gtc tcc caa gcg ggt aac ctg agc ccc acg gcc tca ttg gag 1659
Ser Pro Val Ser Gln Ala Gly Asn Leu Ser Pro Thr Ala Ser Leu Glu
175 180 185

gaa ttc cct ggc ctg cag gtg ccc agc tcc cca tcc taz ctg ctc ccg 1707
Glu Phe Pro Gly Leu Gln Val Pro Ser Ser Pro Ser Tyr Leu Leu Pro
190 195 200 205

gga gca ctg gtg ttc tca gac ttc ttg tga agagacctgt ctggctctgg 1757
Gly Ala Leu Val Phe Ser Asp Phe Leu *
210

gtggtgggtg ctagtggaaa gggaggggac cagagccgtc tggactggga ggtagtggag 1817

gctctcaagc atctcgcctc ttctggcttt cactacttgg atcc 1861

<210> 4

<211> 214

<212> PRT

<213> M. musculus

His Asn Tyr Ile Trp Ala Leu Thr Gln Thr Leu Arg Ile Ala Asp His

Ser Phe Tyr Gly Pro Glu Pro Pro Val Pro Cys Gly Glu Leu Gly Ser
150 155 160

Pro Gly Gly Gly Ser Asn Gly Asp Trp Gly Ser Ile Tyr Ser Pro Val

Ser Gln Ala Gly Asn Leu Ser Pro Thr Ala Ser Leu Glu Glu Phe Pro

Gly Leu Gln Val Pro Ser Ser Pro Ser Tyr Leu Leu Pro Gly Ala Leu
195

Val Phe Ser Asp Phe Leu
210

ABSTRACT



Neurogenic differentiation genes and proteins are identified,
isolated, and sequenced. Expression of neuroD has been
demonstrated in neural, pancreatic, and gastrointestinal
cells. Ectopic expression of neuroD in non-neuronal cells of
Xenopus embryos induced formation of neurons.

sentative polynucleotide molecules encoding members of

NEUROGENIC DIFFERENTIATION SmoD S include neuroD, neuroD2 and ne«roD3. 

(NEUROD) GENES ^ represenlaUve nucleoade sequence encoding murine 

is a cooti«uation.in-p«rt of parent apphcadon U.S. Ser. No. ^ ^EQ^ mi- 1^ i„ SEQ ID N0:2. "nierc is a highly 

08/239,238, filed May 6, 1994 (abandoned). Zi^«l «rion foUowiflg the heUx-2 domain from amino 

r<^,T"J^T^TZ:Tr^^'tf ZTs^S^'^°oA .99of SBQIDNO:2thatis„ot 

a^t^^°v^er^l1e^..-in.c.^^^^^^^ .0 ^-^-^^^^^^^^ 

FIELD OFTHE mVENnON neuroD is shown in SEQ ID N0:3. The HLH coding donain 

^ . of Xenopus neuroD resides bebA-een nucleotides 376 and 
The invenUon relates to molecular biology and in par- id N0:3. The deduced amino add sequence of 

ticular to genes and proteins involved in vertebrate neural ,5 ^^^^ in SEQ ID N0:4. There is a 

development. highly conserved region fo"?*i°8 *e helfe-2 do^^^ 

^ . o«^^nf» arid 1*57 throufih amine aad 199 oi btiKi il* nvj.*t 

BACKGROUND OF THE mVEKnON amine -"^^J^^J''^^ ,hlH proteins. 

There axe cuHently several exan^les of transcription Representative nucleotide and deduced amine aad 

reSTtory proteins shiing a basic helix-loop-helix (bHLH^ 20 ^^^^ n,^oD family are ^hown ^ SHJ 

secondary structure. bHLH proteins form homodimenc and ^ n0S:8-15. Representative nucleotide and deduced 

homo^c complexes binding DNA in the 5' regulatoiy ^ sequences of a human homolog of rnmne 

Sn7rf^e»" controlling expression. Among Uie bHLH .^own in SEQ ID N0S;8 f " 9^=^=^ ^nom^c 

Sns^^inmalianMyoDandDrosophilaAS-Care pres- ^^ ^^^^^^ sEQ ID N0S:14 and 15 (human cDNA) 

Suvth^^^T^lay development « Representative nucleotide and deduced a^.^ 

opment Ld in Lnsc^ org^ development. «sped.vely ^^^^^^^^^ „f ,he human and murine neuroD2 ^ ^^J^ 

Bo^ proteins are thought to exert their eifects by bmduig 5' ^ nOS:10 and U. and 15 and 17, r^^lf^^^^ 

Story nucleotide S«iaences in genes that seem speafl- Representative nucleotide and deduced ^"""^ ^<=>^ 

cX determinaUve of cellular differentiaUon and fate. ^^,3 ^uman neuroD3 are "^^f .^^ 

Cever. the specific developmental roles of Oie genes 30 j^^^.^^ and 13. The disclosed 

XtedbyMyoDandAS-CremainlaigelyUBtoiown,asare ,„„esponding cDNA HC2A; now refened to as human 

fte molecVdetails of the developmental pathways rcgu- ^^^^^^ i4bj(„ow referred to l"^" 

lated by these genes. h»ve an Wentical HLH motif: amine acid residues 117^^^ 

nie presenUy disclosed NeuroD proteins represent a new sEQ ID NOrf) atid IS, and ["J.*^" 137-176 m SEQ IV 

tbatbinds totheinsuUn&box sequence. SMf* f^^f"^ ^Vofig J SEO ID N0:12) and Uat murine NeuroDl 

J Biochem. 229: 239-248, 1995) disclosed the isolation of 149-268 of SEQ '^•'^^ ^^ds residues 

a mouse HLH protein, MArH-2 that is d^^^^ ~ ^ s^ffi nTi7 Wonding .0 nucleotides 

tissue. Comparison of these sequences with the nenroD 138-177 of SEQ iv i^u k ^^^^ ^^.^^ 

sequences disclosed herein demonstxate that they are mem- ^2^^^'/^^^^^ '^^ NeuroD and human 

beis of the NeuroD family of proteins. V, rvi 

Neural tissues and endocrine tissues do not regenerate. NeuroUil. 
Damage is permanent Paralysis, loss of vision or hearing jj^j^p DESCRIPTION OF THE DRAWINGS 

and hormonal insufficiency are also P^'^'^'^b.^^''^ p.^, , schematically depicts the domain structure of the 

m^l'elnfxr^sUDbHUl proteins. 

therapeutic drugs may have on nervous tissues. The medical dEFAIUED DESCRimON OF THE 

community and public would greatly benefit from the avail- prefeRI^D EMBODIMENTS 

'^'^^S^Lrc^u'^ l^rn'eSISS „ Tissue-specific bHLH proteins that reflate early ne^ 
1::''"=^":; ctttnicUon of test ceU lioes^ r^d'^'^c^t.'^^^^^^ 

gene thera^ -d differentiaUon of tumor ceUs. ^^S'::';^'^!^^^^^^ residues 

S^MARVOPT^INVEKnOK i"irall^^^::;,^s^re« 
MammaUan and amphibian NctiroD Pro«™^ were e ^ ^^^^^3 sequences and pro- 

identifted, and polynucleoUde molecules encodmg NeuroD neurou, neui 

proteins were isolated and sequenced. "^'""^Se^Es^co^ ^ ttansiently expressed in diff^^ritbit. 

proteins that are distincUve ™ " ing ne^ons during embryogenesis. NeuroD is also detected 

addition, the present invention P'°'^'^''A}^^J^^I J Idult brain, in toe granule layer of the hippocampus and 
proteins that share a highly conserved HLH regiOD. Repre inaouiia 
the cerebellum. In addition, murine neuroD expression has
been detected in the pancreas and gastrointestinal tissues of
developing embryos and post-natal mice. NeuroD contains
the basic helix-loop-helix (bHLH) domain structure that has
been implicated in the binding of bHLH proteins to
upstream recognition sequences and activation of down-
stream target genes. The present invention provides repre-
sentative NeuroD proteins, which include the murine Neu-
roD protein of SEQ ID NO:2, the amphibian NeuroD protein
of SEQ ID NO:4, murine NeuroD2 protein of SEQ ID
NO:17, human NeuroD protein of SEQ ID NOS:9 and 15,
human NeuroD2 protein of SEQ ID NO:11, and human
NeuroD3 protein of SEQ ID NO:13. Based on homology
with other bHLH proteins, the bHLH domain for murine
NeuroD is predicted to reside between amino acids 102 and
155 of SEQ ID NO:2, and between amino acids 101 and 157
of SEQ ID NO:4 for the amphibian NeuroD.

As detailed below, the present invention provides the
identification of human neuroD and, in addition, provides
unexpected homologous genes of the same family based on
highly conserved sequences across the HLH domain shared
between the two human genes at the amino acid level
(neuroD2 and neuroD3; SEQ ID NOS:10 and 11, and 12 and
13, respectively).

NeuroD proteins are transcriptional activators that control
transcription of downstream target genes that cause neuronal
progenitors to differentiate into mature neurons. In the
neurula stage of the mouse embryo (e10), murine neuroD is
highly expressed in the neurogenic derivatives of neural
crest cells, the cranial and dorsal root ganglia, and postmi-
totic cells in the central nervous system (CNS). During
mouse development, neuroD is expressed transiently and
concomitant with neuronal differentiation in differentiating
neurons in sensory organs such as in nasal epithelium and
retina. In Xenopus embryos ectopic expression of neuroD in
non-neuronal cells induced formation of neurons. As dis-
cussed in more detail below, NeuroD proteins are expressed
in differentiating neurons and are capable of causing the
conversion of non-neuronal cells into neurons. The present
invention encompasses NeuroD variants that, for example,
are modified in a manner that results in a NeuroD protein
capable of binding to its recognition site, but unable to
activate downstream genes. The present invention also
encompasses fragments of NeuroD that, for example, are
capable of binding the natural NeuroD partner, but are
incapable of activating downstream genes. NeuroD proteins
encompass proteins retrieved from naturally occurring mate-
rials and closely related, functionally similar proteins
retrieved by antisera specific to NeuroD, and recombinantly
expressed proteins encoded by genetic materials (DNA,
RNA, cDNA) retrieved on the basis of their similarity to the
unique regions in the neuroD family of genes.

The present invention provides representative isolated
and purified polynucleotide molecules encoding proteins of
the NeuroD family. Representative polynucleotide mol-
ecules encoding NeuroD include the sequences presented in
SEQ ID NOS:1, 3, 8, 10, 12, 14, and 16. Polynucleotide
molecules encoding NeuroD include those sequences result-
ing in minor genetic polymorphisms, differences between
species, and those that contain amino acid substitutions,
additions, and/or deletions. According to the present
invention, polynucleotide molecules encoding NeuroD
encompass those molecules that encode NeuroD proteins or
peptides that share identity with the sequences shown in
SEQ ID NOS:2, 4, 9, 11, 13, 15, and 17. Such molecules will
generally share greater than 35% identity at the amino acid
level with the disclosed sequences. The polynucleotide
molecules of the present invention may share greater iden-
tity at the amino acid level across highly conserved regions
such as the HLH domain. For example, the deduced amino
acid sequences of murine and Xenopus neuroD genes are
96% identical.

In some instances, one may employ such changes in the
sequence of a recombinant neuroD to substantially decrease
or even increase the biological activity of NeuroD relative to
the wild-type NeuroD activity, depending on the intended
use of the preparation. Such changes may also be directed
towards endogenous neuroD sequences using, for example,
gene therapy methods to alter the gene product.

The NeuroD proteins of the present invention are capable
of inducing the expression of neuronal-specific genes, such
as N-CAM, β-tubulin, and Xen-1, neurofilament M (NF-M),
Xen-2, tanabin-1, shaker-1, and frog HSCL, in a frog
embryo. As described below, NeuroD activity may be
detected when NeuroD is ectopically expressed in frog
oocytes following, for example, injection of Xenopus neu-
roD RNA into one of the two cells in a two-cell stage
Xenopus embryo, and monitoring expression of neuronal-
specific genes in the injected as compared to un-injected side
of the embryo by immunochemistry or in situ hybridization.

"Over-expression" means an increased level of NeuroD
protein or neuroD transcripts in a recombinant transformed
host cell relative to the level of protein or transcripts in the
parental cell from which the host cell is derived.

As noted above, the present invention provides isolated
and purified polynucleotide molecules encoding NeuroD
and other members of the NeuroD family. The disclosed
sequences may be used to identify and isolate neuroD
polynucleotide molecules from suitable host cells such as
canine, ovine, bovine, caprine, lagomorph, or avian. In
particular, the nucleotide sequences encoding the HLH
region may be used to identify polynucleotide molecules
encoding other proteins of the NeuroD family. Complemen-
tary DNA molecules encoding NeuroD family members may
be obtained by constructing a cDNA library from mRNA from, for
example, fetal brain, newborn brain, adult brain and brain
tissues. DNA molecules encoding NeuroD family members
may be isolated from such a library using the disclosed
sequences in standard hybridization techniques (e.g., Sam-
brook et al., ibid., and Bothwell, Yancopoulos and Alt, ibid.)
or by amplification of sequences using polymerase chain
reaction (PCR) amplification (e.g., Loh et al., Science 243:
217-222, 1989; Frohman et al., Proc. Natl. Acad. Sci. USA
85: 8998-9002, 1988; and Erlich (ed.), PCR Technology:
Principles and Applications for DNA Amplification, Stock-
ton Press, 1989; which are incorporated by reference herein
in their entirety). In a similar manner, genomic DNA encod-
ing NeuroD may be obtained using probes designed from the
sequences disclosed herein. Suitable probes for use in iden-
tifying neuroD sequences may be obtained from neuroD-
specific sequences that are highly conserved regions
between mammalian and amphibian neuroD coding
sequences. Primers, for example, from the region encoding
the approximately 40 residues following the helix-2 domain
are suitable for use in designing PCR primers. Alternatively,
oligonucleotides containing specific DNA sequences from a
human neuroD coding region may be used within the
described methods to identify related human neuroD
genomic and cDNA clones. Upstream regulatory regions of
neuroD may be obtained using the same methods. Suitable
PCR primers are between 7-50 nucleotides in length, more
preferably between 15 and 25 nucleotides in length.
Alternatively, neuroD polynucleotide molecules may be
isolated using standard hybridization techniques with probes
of at least about 7 nucleotides in length and up to and
including the full coding sequence. Southern analysis of
mouse genomic DNA probed with the murine neuroD cDNA
under stringent conditions showed the presence of only one
gene, suggesting that under stringent conditions bHLH
genes from other protein families will not be identified.
Other members of the neuroD family can be identified using
degenerate oligonucleotides based on the sequences dis-
closed herein for PCR amplification or by hybridization at
moderate stringency.

The regulatory regions of neuroD may be useful as
tissue-specific promoters. Such regulatory regions may find
use in, for example, gene therapy to drive the tissue-specific
expression of heterologous genes in pancreatic,
gastrointestinal, or neural cells, tissues or cell lines. As
shown in Example 14, murine neuroD promoter sequences
reside within the 1.4 kb 5' untranslated region. Regulatory
sequences within this region are identified by comparison to
other promoter sequences and/or deletion analysis of the
region itself.

A DNA molecule coding a NeuroD protein is inserted into
a suitable expression vector, which is in turn used to
transfect or transform a suitable host cell. Suitable expres-
sion vectors for use in carrying out the present invention
include a promoter capable of directing the transcription of
a polynucleotide molecule of interest in a host cell. Repre-
sentative expression vectors may include both plasmid and/
or viral vector sequences. Suitable vectors include retroviral
vectors, vaccinia viral vectors, CMV viral vectors, BLUE-
SCRIPT™ vectors, baculovirus vectors, and the like. Pro-
moters capable of directing the transcription of a cloned
gene or cDNA may be inducible or constitutive promoters
and include viral and cellular promoters. For expression in
mammalian host cells, suitable viral promoters include the
immediate early cytomegalovirus promoter (Boshart et al.,
Cell 41: 521-530, 1985) and the SV40 promoter (Subramani
et al., Mol. Cell. Biol. 1: 854-864, 1981). Suitable cellular
promoters for expression of proteins in mammalian host
cells include the mouse metallothionein-1 promoter
(Palmiter et al., U.S. Pat. No. 4,579,821), a mouse Vκ
promoter (Bergman et al., Proc. Natl. Acad. Sci. USA 81:
7041-7045, 1983; Grant et al., Nucleic Acids Res. 15: 5496,
1987), and tetracycline-responsive promoter (Gossen and
Bujard, Proc. Natl. Acad. Sci. USA 89: 5547-5551, 1992, and
Pescini et al., Biochem. Biophys. Res. Comm. 202:
1664-1667, 1994). Also contained in the expression vectors,
typically, is a transcription termination signal located down-
stream of the coding sequence of interest. Suitable transcrip-
tion termination signals include the early or late polyade-
nylation signals from SV40 (Kaufman and Sharp, Mol. Cell.
Biol. 2: 1304-1319, 1982), the polyadenylation signal from
the Adenovirus 5 e1B region, and the human growth hor-
mone gene terminator (DeNoto et al., Nucleic Acids Res. 9:
3719-3730, 1981). Mammalian cells, for example, may be
transfected by a number of methods including calcium
phosphate precipitation (Wigler et al., Cell 14: 725, 1978;
Corsaro and Pearson, Somatic Cell Genetics 7: 603, 1981;
Graham and Van der Eb, Virology 52: 456, 1973),
lipofection, microinjection, and electroporation (Neumann
et al., EMBO J. 1: 841-845, 1982). Mammalian cells can be
transduced with viruses such as SV40, CMV, and the like. In
the case of viral vectors, cloned DNA molecules may be
introduced by infection of susceptible cells with viral par-
ticles. Retroviral vectors may be preferred for use in
expressing NeuroD proteins in mammalian cells particularly
if NeuroD is used for gene therapy (for review, see, Miller
et al., Methods in Enzymology 217: 581-599, 1994; which is
incorporated herein by reference in its entirety). It may be
preferable to use a selectable marker to identify cells that
contain the cloned DNA. Selectable markers are generally
introduced into the cells along with the cloned DNA mol-
ecules and include genes that confer resistance to drugs,
such as neomycin, hygromycin, and methotrexate. Select-
able markers may also complement auxotrophies in the host
cell. Yet other selectable markers provide detectable signals,
such as beta-galactosidase to identify cells containing the
cloned DNA molecules. Selectable markers may be ampli-
fiable. Such amplifiable selectable markers may be used to
amplify the number of sequences integrated into the host
genome.

As would be evident to one of ordinary skill in the art, the
polynucleotide molecules of the present invention may be
expressed in Saccharomyces cerevisiae, filamentous fungi,
and E. coli. Methods for expressing cloned genes in Saccha-
romyces cerevisiae are generally known in the art (see,
"Gene Expression Technology," Methods in Enzymology,
Goeddel (ed.), Academic Press, San Diego, Calif.,
1990; and "Guide to Yeast Genetics and Molecular
Biology," Methods in Enzymology, Guthrie and Fink (eds.),
Academic Press, San Diego, Calif., 1991; which are incor-
porated herein by reference). Filamentous fungi may also be
used to express the proteins of the present invention; for
example, strains of the fungi Aspergillus (McKnight et al.,
U.S. Pat. No. 4,935,349, which is incorporated herein by
reference). Methods for expressing genes and cDNAs in
cultured mammalian cells and in E. coli are discussed in
Sambrook et al. (Molecular Cloning: A Laboratory
Manual, Second Edition, Cold Spring Harbor, N.Y., 1989;
which is incorporated herein by reference). As will be
evident to one skilled in the art, one can express the protein
of the instant invention in other host cells such as avian,
insect, and plant cells using regulatory sequences, vectors
and methods well established in the literature.

NeuroD proteins produced according to the present inven-
tion may be purified using a number of established methods
such as affinity chromatography using anti-NeuroD antibod-
ies coupled to a solid support. Fusion proteins of antigenic
tag and NeuroD can be purified using antibodies to the tag.
Additional purification may be achieved using conventional
purification means such as liquid chromatography, gradient
centrifugation, and gel electrophoresis, among others. Meth-
ods of protein purification are known in the art (see
generally, Scopes, R., Protein Purification, Springer-Verlag,
1982, which is incorporated herein by reference) and
may be applied to the purification of recombinant NeuroD
described herein.

The term "capable of hybridizing under stringent condi-
tions" as used herein means that the subject nucleic acid
molecules (whether DNA or RNA) anneal to an oligonucle-
otide of 15 or more contiguous nucleotides of SEQ ID NO:1,
SEQ ID NO:3, SEQ ID NO:8, SEQ ID NO:10, SEQ ID
NO:12, SEQ ID NO:14 or SEQ ID NO:16.

The choice of hybridization conditions will be evident to
one skilled in the art and will generally be guided by the
purpose of the hybridization, the type of hybridization
(DNA-DNA or DNA-RNA), and the level of desired relat-
edness between the sequences. Methods for hybridization
are well established in the literature. See, for example:
Sambrook et al., ibid.; Hames and Higgins, eds., Nucleic
Acid Hybridization, A Practical Approach, IRL Press, Wash-
ington D.C., 1985; Berger and Kimmel, eds., Methods in
Enzymology, Vol. 152, Guide to Molecular Cloning
Techniques, Academic Press Inc., New York, N.Y., 1987;
and Bothwell, Yancopoulos and Alt, eds., Methods for Clon-
ing and Analysis of Eukaryotic Genes, Jones and Bartlett
Publishers, Boston, Mass., 1990; which are incorporated by
reference herein in their entirety. One of ordinary skill in the
art realizes that the stability of nucleic acid duplexes will
decrease with an increased number and location of mis-
matched bases; thus, the stringency of hybridization may be
used to maximize or minimize the stability of such duplexes.
Hybridization stringency can be altered by: adjusting the
temperature of hybridization; adjusting the percentage of
helix-destabilizing agents, such as formamide, in the hybrid-
ization mix; and adjusting the temperature and salt concen-
tration of the wash solutions. In general, the stringency of
hybridization is adjusted during the post-hybridization
washes by varying the salt concentration and/or the tem-
perature. Stringency of hybridization may be reduced by
reducing the percentage of formamide in the hybridization
solution or by decreasing the temperature of the wash
solution. High stringency conditions may involve high tem-
perature hybridization (e.g., 65°-68° C. in aqueous solution
containing 4-6x SSC, or 42° C. in 50% formamide) com-
bined with washes at high temperature (e.g., 5°-25° C. below the Tm)
and a low salt concentration (e.g., 0.1x SSC). Reduced
stringency conditions may involve lower hybridization tem-
peratures (e.g., 35°-42° C. in 20-50% formamide) with
intermediate temperature (e.g., 40°-60° C.) and washes in a
higher salt concentration (e.g., 2-6x SSC). Moderate strin-
gency conditions, which may involve hybridization at a
temperature between 50° C. and 55° C. and washes in 0.1x
SSC, 0.1% SDS at between 50° C. and 55° C., may be used
to identify clones encoding members of the NeuroD family.
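
By way of illustration only, the dependence of stringency on salt concentration, formamide, and temperature described above may be sketched numerically. The following Python sketch is not part of the disclosed methods. It assumes the widely used empirical approximation Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 0.61(%formamide) - 500/L for long DNA-DNA hybrids, and it assumes that 1x SSC supplies about 0.195 M sodium ion. The function names, the 500 bp probe length, and the 50% G+C content are hypothetical values chosen for the example.

```python
import math

# Assumption: 1x SSC = 0.15 M NaCl + 0.015 M trisodium citrate,
# i.e. roughly 0.195 M total sodium ion.
NA_MOLAR_PER_1X_SSC = 0.195


def estimate_tm(length_bp, gc_percent, ssc_x, formamide_percent=0.0):
    """Approximate Tm (deg C) of a long DNA-DNA hybrid.

    Uses the common empirical formula (an approximation, not a
    measured value for any disclosed sequence).
    """
    na_molar = ssc_x * NA_MOLAR_PER_1X_SSC
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


def wash_window(tm):
    """Return the (low, high) wash temperatures 25 to 5 deg C below Tm."""
    return (tm - 25.0, tm - 5.0)


if __name__ == "__main__":
    # Hypothetical 500 bp probe with 50% G+C content.
    hyb_tm = estimate_tm(500, 50.0, ssc_x=6.0, formamide_percent=50.0)
    wash_tm = estimate_tm(500, 50.0, ssc_x=0.1)
    low, high = wash_window(wash_tm)
    print(f"Estimated Tm in 6x SSC, 50% formamide: {hyb_tm:.1f} C")
    print(f"Estimated Tm in 0.1x SSC wash: {wash_tm:.1f} C")
    print(f"High stringency wash window: {low:.1f} to {high:.1f} C")
```

Lowering the salt concentration of the wash or adding formamide lowers the estimated Tm, so a fixed wash temperature lies closer to the Tm and is therefore more stringent.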

The invention provides isolated and purified polynucle-
otide molecules encoding NeuroD proteins that are capable
of hybridizing under stringent conditions to an oligonucle-
otide of 15 or more contiguous nucleotides of SEQ ID NO:1,
SEQ ID NO:3, SEQ ID NO:8, SEQ ID NO:10, SEQ ID
NO:12, SEQ ID NO:14, and/or SEQ ID NO:16, including
their complementary strands. The subject isolated neuroD
polynucleotide molecules preferably encode NeuroD pro-
teins that trigger differentiation in ectodermal cells, particu-
larly neuroectodermal stem cells, and in more committed
cells of that lineage, for example, epidermal precursor cells,
pancreatic and gastrointestinal cells. Such neuroD expres-
sion products typically form heterodimeric bHLH protein
complexes that bind in the 5'-regulatory regions of target
genes and enhance or suppress transcription of the target
gene.
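
As a rough computational aid only, a candidate sequence may be screened for stretches of 15 or more contiguous nucleotides that it shares exactly with a reference sequence. The Python sketch below is a crude proxy under stated assumptions: it looks only for exact matches on the strand supplied, whereas hybridization under stringent conditions is an experimental property that tolerates some mismatch. The function name and the short test sequences are hypothetical and are not taken from any SEQ ID NO.

```python
def shared_kmers(query, reference, k=15):
    """List (position, k-mer) pairs in `query` that occur exactly in `reference`.

    Exact matching on the given strand only; this is a coarse screen,
    not a prediction of hybridization behavior.
    """
    query = query.upper()
    reference = reference.upper()
    # Index every k-long window of the reference for fast lookup.
    ref_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    hits = []
    for i in range(len(query) - k + 1):
        window = query[i:i + k]
        if window in ref_kmers:
            hits.append((i, window))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequences for illustration.
    query = "AAAACCCCGGGGTTTTA"
    reference = "TTAAAACCCCGGGGTTTT"
    for position, kmer in shared_kmers(query, reference):
        print(position, kmer)
```

To cover the complementary strand as well, the same function could be run a second time on the reverse complement of the query.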

In some instances, cancer cells may contain a non- 
functional NeuroD protein or may contain no NeuroD 
protein due to genetic mutation or somatic mutations such 
that these cells fail to differentiate. For cancers of this type, 
the cancer cells may be treated in a manner to cause the 
over-expression of wild-type NeuroD protein to force dif- 
ferentiation of the cancer cells. 

Antisense neuroD nucleotide sequences may be used to
block expression of mutant neuroD in neuronal
precursor cells to generate and harvest neuronal stem cells.
The use of antisense oligonucleotides and their applications
have been reviewed in the literature (see, for example, Mol
and Van der Krol, eds., Antisense Nucleic Acids and Proteins:
Fundamentals and Applications, New York, N.Y., 1992;
which is incorporated by reference herein in its entirety).
Suitable antisense oligonucleotides are at least 11 nucleotides
in length and may include untranslated (upstream or intron)
and associated coding sequences. As will be evident to one
skilled in the art, the optimal length of an antisense
oligonucleotide depends on the strength of the interaction
between the antisense oligonucleotide and the complemen-
tary mRNA, the temperature and ionic environment in
which translation takes place, the base sequence of the
antisense oligonucleotide, and the presence of secondary
and tertiary structure in the mRNA and/or in the antisense
oligonucleotide. Suitable target sequences for antisense oli-
gonucleotides include intron-exon junctions (to prevent
proper splicing), regions in which DNA/RNA hybrids will
prevent transport of mRNA from the nucleus to the
cytoplasm, initiation factor binding sites, ribosome binding
sites, and sites that interfere with ribosome progression. A
particularly preferred target region for antisense oligonucle-
otide is the 5' untranslated (promoter/enhancer) region of the
gene of interest. Antisense oligonucleotides may be prepared
by the insertion of a DNA molecule containing the target
DNA sequence into a suitable expression vector such that
the DNA molecule is inserted downstream of a promoter in
a reverse orientation as compared to the gene itself. The
expression vector may then be transduced, transformed or
transfected into a suitable cell resulting in the expression of
antisense oligonucleotides. Alternatively, antisense oligo-
nucleotides may be synthesized using standard manual or
automated synthesis techniques. Synthesized oligonucle-
otides may be introduced into suitable cells by a variety of
means including electroporation, calcium phosphate
precipitation, or microinjection. The selection of a suitable
antisense oligonucleotide administration method will be
evident to one skilled in the art. With respect to synthesized
oligonucleotides, the stability of antisense oligonucleotide-
mRNA hybrids may be increased by the addition of stabi-
lizing agents to the oligonucleotide. Stabilizing agents
include intercalating agents that are covalently attached to
either or both ends of the oligonucleotide. Oligonucleotides
may be made resistant to nucleases by, for example, modi-
fications to the phosphodiester backbone by the introduction
of phosphotriesters, phosphonates, phosphorothioates,
phosphoroselenoates, phosphoramidates, or phospho-
rodithioates. Oligonucleotides may also be made nuclease
resistant by synthesis of the oligonucleotides with alpha-
anomers of the deoxyribonucleotides.
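
For illustration, the sequence of a synthetic antisense oligonucleotide directed at a chosen target region may be derived as the reverse complement of the sense-strand target. The Python sketch below makes several assumptions: the function names are hypothetical, the example sense strand is an invented sequence and not any disclosed SEQ ID NO, and the only design rule enforced is the minimum length of 11 nucleotides stated above. The other considerations listed above, such as secondary structure, ionic environment, and backbone modification, are not modeled.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

MIN_ANTISENSE_LENGTH = 11  # minimum length stated in the text above


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def antisense_oligo(sense_strand, start, length=15):
    """Return a DNA antisense oligonucleotide, written 5' to 3', against
    the `length` nucleotides of `sense_strand` beginning at index `start`."""
    if length < MIN_ANTISENSE_LENGTH:
        raise ValueError("antisense oligonucleotide must be at least 11 nt")
    target = sense_strand[start:start + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the sequence")
    return reverse_complement(target)


if __name__ == "__main__":
    # Invented sense-strand fragment, used only to exercise the functions.
    sense = "ATGGCCAAGTCGTACAGC"
    print(antisense_oligo(sense, start=0, length=12))
```

The same reverse-complement relationship underlies the vector-based approach described above, in which the target DNA is inserted downstream of a promoter in reverse orientation.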
NeuroD binds to 5' regulatory regions of neurogenic
genes that are involved in neuroectodermal differentiation,
including development of neural and endocrine tissues. As
described in more detail herein, murine neuroD has been
detected in neuronal, pancreatic and gastrointestinal tissues
in embryonic and adult mice suggesting that NeuroD func-
tions in the transcription regulation in these tissues. NeuroD
proteins alter the expression of subject genes by, for
example, down-regulating or up-regulating transcription, or
by inducing a change in transcription to an alternative open
reading frame. The subject polynucleotide molecules find a
variety of uses, e.g., in preparing oligonucleotide probes,
expression vectors, and transformed host cells, as disclosed
below in the following Examples.

DNA sequences recognized by NeuroD may be deter-
mined using a number of methods known in the literature
including immunoprecipitation (Biedenkapp et al., Nature
335: 835-837, 1988; Kinzler and Vogelstein, Nuc. Acids
Res. 17: 3645-3653, 1989; and Sompayrac and Danna, Proc.
Natl. Acad. Sci. USA 87: 3274-3278, 1990; which are
incorporated by reference herein), protein affinity columns
(Oliphant et al., Mol. Cell. Biol. 9: 2944-2949, 1989; which
is incorporated by reference herein), gel mobility shifts
(Blackwell and Weintraub, Science 250: 1104-1110, 1990;
which is incorporated by reference herein), and Southwest-
ern blots (Keller and Maniatis, Nuc. Acids Res.
17: 4675-4680, 1991; which is incorporated by reference
herein).
One embodiment of the present invention involves the
construction of interspecies hybrid NeuroD proteins and
hybrid NeuroD proteins containing one or more domains
from another NeuroD family member to facilitate structure-
function analyses or to alter NeuroD activity by increasing
or decreasing the transcriptional activation of neurogenic
genes by NeuroD relative to the wild-type NeuroD(s).
Hybrid proteins of the present invention may contain the
replacement of one or more contiguous amino acids of the
native NeuroD with the analogous amino acid(s) of NeuroD
from another species or other protein of the NeuroD family.
Such interspecies or interfamily hybrid proteins include
hybrids having whole or partial domain replacements. Such
hybrid proteins are obtained using recombinant DNA tech-
niques. Briefly, DNA molecules encoding the hybrid Neu-
roD proteins of interest are prepared using generally avail-
able methods such as PCR mutagenesis, site-directed
mutagenesis, and/or restriction digestion and ligation. The
hybrid DNA is then inserted into expression vectors and
introduced into suitable host cells. The biological activity
may be assessed essentially as described in the assays set
forth in more detail in the Examples that follow.

The invention also provides synthetic peptides, recombi-
nantly derived peptides, fusion proteins, and the like that
include a portion of NeuroD or the entire protein. The
subject peptides have an amino acid sequence encoded by a
nucleic acid which hybridizes under stringent conditions
with an oligonucleotide of 15 or more contiguous nucle-
otides of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:8, SEQ
ID NO:10, SEQ ID NO:12, SEQ ID NO:14, or SEQ ID
NO:16. Representative amino acid sequences of the subject
peptides are disclosed in SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15,
and SEQ ID NO:17. The subject peptides find a variety of
uses, including preparation of specific antibodies and prepa-
ration of antagonists of NeuroD activity.

As noted above, the invention provides antibodies that
bind to NeuroD. The production of non-human antisera or
monoclonal antibodies (e.g., murine, lagomorph, porcine,
equine) is well known and may be accomplished by, for
example, immunizing an animal with NeuroD protein or
peptides. For the production of monoclonal antibodies,
antibody producing cells are obtained from immunized
animals, immortalized and screened, or screened first for the
production of the antibody that binds to the NeuroD protein
or peptides and then immortalized. It may be desirable to
transfer the antigen binding regions (e.g., F(ab')2 or hyper-
variable regions) of non-human antibodies into the frame-
work of a human antibody by recombinant DNA techniques
to produce a substantially human molecule. Methods for
producing such "humanized" molecules are generally well
known and described in, for example, U.S. Pat. No. 4,816,
397; which is incorporated by reference herein in its entirety.
Alternatively, a human monoclonal antibody or portions
thereof may be identified by first screening a human B-cell
cDNA library for DNA molecules that encode antibodies
that specifically bind to NeuroD, e.g., according to the
method generally set forth by Huse et al. (Science 246:
1275-1281, 1989, which is incorporated by reference herein
in its entirety). The DNA molecule may then be cloned and
amplified to obtain sequences that encode the antibody (or
binding domain) of the desired specificity.

The invention also provides methods for inducing the
expression of genes associated with neuronal phenotype in
a cell that does not normally express those genes. Examples
of neuronal phenotypes that may be modulated by NeuroD
expression include expression of neurotransmitters or neu-
romodulatory factors. Cells that can be used for the purpose
of modulation of gene expression by NeuroD include cells
of the neuroectodermal lineage, glial cells, neural crest cells,
and epidermal epithelial basal stem cells, and all types of
both mesodermal and endodermal lineage cells. NeuroD
expression may also be used within methods that induce
expression of genes associated with pancreatic and gas-
trointestinal phenotype. Examples of such gene expression
include insulin expression, and gastrointestinal-specific
enzyme expression.

As illustrated in Example 10, the expression of Xenopus
NeuroD protein in stem cells causes redirection of epidermal
cell differentiation and induces terminal differentiation into
neurons, i.e., instead of epidermal cells. Epithelial basal
stem cells ([...] mucosal tissues) are one of the
continuously regenerating cell types in an adult mam-
mal. Introduction of the subject nucleotide sequences
[...] in vivo using a suitable gene therapy vector delivery system
([...]
vector), a microinjection technique (see,
for example, Tam, Basic Life Sciences 37: 187-194, 1986,
which is incorporated by reference herein in its entirety), or
a transfection method (e.g., naked or liposome encapsulated
DNA or RNA; see, for example, Trends in Genetics 5: 138,
1989; Chen and Okayama, Biotechniques 6: 632-638, 1988;
Mannino and Gould-Fogerite, Biotechniques 6: 682-690,
1988; Kojima et al., Biochem. Biophys. Res. Comm. 207:
8-12, 1995; which are incorporated by reference herein in
their entirety). The introduction method may be chosen to
achieve a transient expression of NeuroD in the host cell, or
it may be preferable to achieve constitutive or regulated
expression in a tissue specific manner.

Transformed host cells of the present invention find a
variety of in vitro uses, for example: i) as convenient sources
of neuronal and other growth factors, ii) in transient and
continuous cultures for screening anti-cancer drugs capable
of driving terminal differentiation in neural tumors, iii) as
sources of recombinantly expressed NeuroD protein for use
as an antigen in preparing monoclonal and polyclonal anti-
bodies useful in diagnostic assays, and iv) in transient and
continuous cultures for screening for compounds capable of
increasing or decreasing the activity of NeuroD.

Transformed host cells of the present invention also find
a variety of in vivo uses, for example, for transplantation at
sites of traumatic neural injury where motor or sensory
neural activity has been lost. Representative patient popu-
lations that may benefit from transplantation include:
patients with hearing or vision loss due to optical or auditory
nerve damage, patients with peripheral nerve damage and
loss of motor or sensory neural activity, and patients with
brain or spinal cord damage from traumatic injury. For
example, donor cells from a patient such as epithelial basal
stem cells are cultured in vitro and then transformed or
transduced with a neuroD nucleotide sequence. The trans-
formed cells are then returned to the patient by microinjec-
tion at the site of neural dysfunction. In addition, trans-
formed host cells of the present invention may be useful for
transplantation into patients with diabetes. For example,
donor cells from a patient such as fibroblasts, pancreatic islet
cells, or other pancreatic cells are harvested and transformed
or transfected with a neuroD nucleotide sequence. The
genetically engineered cells are then returned to the patient.
In another embodiment, such engineered host cells may find
use in the treatment of malabsorption syndromes.

Representative uses of the nucleotide sequences of the
invention include the following:

1. Construction of cDNA and oligonucleotide probes
useful in Northern, Southern, and dot-blot assays for iden-
tifying and quantifying the level of expression of neuroD in
a cell. High level expression of neuroD in neuroendocrine
tumors and in rapidly proliferating regions of embryonic
neural development (see below) indicates that measuring the
level of neuroD expression may provide prognostic markers
for assessing the growth rate and invasiveness of a neural
tumor. In addition, considering the important role of NeuroD
in embryonic development, it is thought highly likely that
birth defects and abortions may result from expression of an
abnormal NeuroD protein. In this case, NeuroD may prove
highly useful in prenatal screening of mothers and/or for in
[...]

2. Construction of recombinant cell lines, ova, and trans-
genic embryos and animals including dominant-negative
and "knock-out" recombinant cell lines in which the tran-
scription regulatory activity of NeuroD protein is down-
regulated or eliminated. Such cells may contain altered
neuroD coding sequences that result in the expression of a
NeuroD protein that is not capable of enhancing, suppress-
ing or activating transcription of the target gene. The subject
cell lines and animals find uses in screening for candidate
therapeutic agents capable of either substituting for a func-
tion performed by NeuroD or correcting the cellular defect
caused by a defective NeuroD. Considering the important
regulatory role of NeuroD in embryonic development, birth
defects may occur from expression of mutant NeuroD
proteins, and these defects may be correctable in utero or in
early post-natal life through the use of compounds identified
in screening assays using NeuroD. In addition, neuroD
polynucleotide molecules may be joined to reporter genes,
such as β-galactosidase or luciferase, and inserted into the
genome of a suitable embryonic host cell such as a mouse
embryonic stem cell by, for example, homologous recom-
bination (for review, see Capecchi, Trends in Genetics 5:
70-76, 1989; which is incorporated by reference). Cells
expressing NeuroD may then be obtained by subjecting the
differentiating embryonic cells to cell sorting, leading to the
purification of a population of neuroblasts. Neuroblasts may
be useful for studying neuroblast sensitivity to growth
factors or chemotherapeutic agents. The neuroblasts may

coding region of the vector under the control of a promoter,
neuroD gene therapy may be used to correct traumatic
neural injury that has resulted in loss of motor or sensory
neural function, and also for the treatment of diabetes. For
these therapies, gene transfer vectors may either be injected
directly at the site of the traumatic injury, or the vectors may
be used to construct transformed host cells that are then
injected at the site of the traumatic injury. The results
disclosed in Example 10 indicate that introduction of neu-
roD induces a non-neuronal cell to become a neuron. This
discovery raises for the first time the possibility of using
transplantation and/or gene therapy to repair neural defects
resulting from traumatic injury. In addition, the discovery of
[...] possibility of providing specific gene
therapy for the treatment of certain neurological disorders
such as Alzheimer's disease, Huntington's disease, and
Parkinson's disease, in which a population of neurons have
been damaged. Two basic methods of neuroD utilization are
envisioned in this regard. In one method, neuroD is
expressed in existing populations of neurons to modulate
aspects of their neuronal phenotype (e.g., neurotransmitter
expression or synapse targeting) to make the neurons
express a factor or phenotype to overcome the deficiency
that contributes to the disease. In this method, recombinant
neuroD sequences are introduced into existing neurons or
endogenous neuroD expression is induced. In another
method, neuroD is expressed in non-neuronal cells (e.g.,
glial cells in the brain or another non-neuronal cell type such
as basal epithelial cells) to induce expression of genes that
confer a complete or partial neuronal phenotype that ame-
liorates aspects of the disease. As an example, Parkinson's
disease is caused, at least in part, by the death of neurons that
supply the neurotransmitter dopamine to the basal ganglia.
Increasing the levels of neurotransmitter ameliorates the
symptoms of Parkinson's disease. Expression of neuroD in
basal ganglia neurons or glial cells may induce aspects of a
neuronal phenotype such that the neurotransmitter dopamine
is produced directly in these cells. It may also be possible to
express neuroD in donor cells for transplantation into the

also be used as a source from which to purify specific protein ^ affected region, either as syngeneic or allogeneic transplan- 

products or gene transcripts. These products may be used for tations. Within yet another embodiment, neuroD is 

the isolation of growth factors, or for the identification of expressed in non-pancreatic cells to induce expression of 

cell surface markers that can be used to purify stem cell genes that confer a complete or partial pancreatic phenotype 

population firom a donor foe transplantation. that araeUorates aspects of diabetes. Within yet another 

As illustrated in Example 14, "knock-out** mice were 45 embodiment, neuroD is expressed in pancreatic islet cells to 

generated by replacing the murine neuroD coding region induce expression of genes that induce the expression of 

with the p-galactosidase reporter gene and the neomycin insulin. 

rcsistanccgenetoa.ssesstheconsequeiioesofdiminatingthe 4. Preparation of transplantable recombinant neuronal 

murine NeuroD protein and to examine the tissue distribu- precursor ceU populations from erabrvonic ectodermal ceUs, 

tion of NeuroD in fetal and postnatal mice. Mice that were 50 non-neural basal stem cells, and the'like. Establishing cul- 

homozygous for die mutation Qacking NeuroD) had tures of non-malignant neuronal ceUs for use in therapeutic 

diabetes, as demonstrated by high blood glucose levels, and screening assays has proven to be a difficult task. The 

died by day four. Homozygous mutants had blood glucose" isolated polynucleotide molecules encoding NeuroD of the 

levels between 2 and 3 times the blood glucose level of present invenUon permit Uie estabUshment of primary (or 

wild-type mice. Heterozygous mutants exhibited similar 55 continuous) cultures of proliferating embryonic neuronal 

blood glucose levels as wUd-type mice. Examination of stem cells under conditions mimicking those that are active 

stained tissue from fetal and postnatal mice heterozygous for in development and cancer. The resultant cell lines find uses- 

the mutation confirmed the NcuroD expression pattern in i) as sources of novel neural growth factors, ii) in saeening 

neuronal ceUs demonsUated by in situ hybridization assays for anti-cancer compounds, and iii) in assays for 

(Example 4) and also demonstrated neuroD expression in identifying novel neuronal growth factors. High level 

the pancreas and gastrointestinal tract. expression of ncuroD in the embryonic optic rectum (see 

^ "Knock-out** mice may be useful as a model system for below) indicates that NeuroD protein may regulate expres- 

diabetcs. Such mice may be used to study methods to rescue sion of factors n^ophic for growing retinal cells. Such cells 

homozygous mutants and as hosts to test transplant tissue for may be useful sources of growth factors, and may be u seful 

treating diabetes. ^5 in screening assays for candidate therapeutic compounds. 

3- Construction of gene transfer vectors (e.g., retroviral Tlie cell lines and transcription regulatory factors dis- 

vcctors, and the like) wherein neuroD is inserted into the closed herein oflfer the unique advantage thai since they are 
active very early in embryonic differentiation they represent
potential switches, e.g., ON→OFF or OFF→ON, controlling
subsequent cell fate. If the switch can be shown to be
reversible (i.e., ON↔OFF), the NeuroD transcription
regulatory factor and neuroD nucleic acids disclosed herein
provide exciting opportunities for restoring lost neural and/
or endocrine functions in a subject.

The following examples are offered by way of illustration 
and not by way of limitation. 



EXAMPLE 1

Construction of the embryonic stem cell
cDNA library
A continuous murine embryonic stem cell line (i.e., the ES
cell line) having mutant E2A (the putative binding partner of
myoD) was used as a cell source to develop a panel of
embryonic stem cell tumors. Recombinant ES stem cells
were constructed (i.e., using homologous recombination)
wherein both alleles of the putative myoD binding partner
E2A were replaced with drug-selectable marker genes. ES
cells do not make functional E12 or E47 proteins, both of
which are E2A gene products. ES cells form subcutaneous
tumors in congenic mice (i.e., 129J) that appear to contain
representatives of many different embryonal cell types as
judged histologically and through the use of RT-PCR gene
expression assays. Individual embryonic stem cell tumors
were induced in male 129J strain mice by subcutaneous
injection of 1x10'' cells/site. Three weeks later each tumor
was harvested and used to prepare an individual sample of
RNAs. Following random priming and second strand
synthesis the ds-cDNAs were selected based on their size on
0.7% agarose gels and those cDNAs in the range of 400-800
bp were ligated to either Bam HI or Bgl II linkers. (Linkers
were used to minimize the possibility that an internal Bam
HI site in a cDNA might inadvertently be cut during cloning,
leading to an abnormally sized or out-of-frame expression
product.) The resultant individual stem cell tumor DNAs
were individually ligated into the Bam HI cloning site in the
"fl-VP16" 2μ yeast expression vector. This expression
vector, fl-VP16, contains the VP16 activation domain of
Herpes simplex virus (HSV) located between Hind III (HIII)
and Eco RI (RI) sites and under the control of the
Saccharomyces cerevisiae alcohol dehydrogenase promoter, with
LEU2 and ampicillin-resistance selectable markers.
Insertion of a DNA molecule of interest into the Hind III site of
the fl-VP16 vector (i.e., 5' to the VP16 nucleotide sequence),
or into a Bam HI site (i.e., 3' to the VP16 sequence but 5' to
the Eco RI site), results in expression of a VP16 fusion
protein having the protein of interest joined in-frame with
VP16. The resultant cDNA library was termed the
"179-library".

EXAMPLE 2 

Identification and cDNA cloning of neuroD

A yeast two-hybrid screening assay, essentially as described
by Fields and Song (Nature 340: 245, 1989) and modified as
described herein, was used to screen the 179-library
described in Example 1. Yeast two-hybrid screens are
reviewed in Fields and Sternglanz (Trends in
Genetics 10: 286-292, 1994). The library was screened for
cDNAs that interacted with LexA-Da, a fusion protein
between the Drosophila Da (Daughterless) bHLH domain
and the prokaryotic LexA DNA binding domain.
Multimerized LexA binding sites were cloned upstream of two
reporter genes, the HIS3 gene and the β-galactosidase gene.
The S. cerevisiae strain L40 containing a plasmid encoding
the LexA-Da fusion protein was transformed with CsCl
gradient-purified fl-VP16-179-cDNA library. Transformants
were maintained on medium selecting both plasmids (the
LexA-Da plasmid and the cDNA library plasmid) for 16
hours before being subjected to histidine selection on plates
lacking histidine, leucine, tryptophan, uracil, and lysine.
Clones that were HIS+ were subsequently assayed for the
expression of LacZ. To eliminate possible non-specific
cloning artifacts, plasmids from HIS+/LacZ+ clones were isolated and
transformed into S. cerevisiae strain L40 containing a
plasmid encoding a LexA-Lamin fusion. Clones that scored
positive in the interaction with lamin were discarded.
Approximately 400 cDNA clones, which represented 60
different transcripts, were identified as positive in these
assays. Twenty-five percent of the original clones were
subsequently shown to be known bHLH genes on the basis
of their reactivity with specific cDNA probes. One cDNA
clone encoding a VP16-fusion protein that interacted with
Da but not lamin was identified as unique by sequence
analysis. This clone, initially termed tango, is now referred
to as neuroD.
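
The selection logic of the screen described above can be summarized in a short sketch. This is an illustration only, not part of the disclosed method: the `Clone` record, its field names, and the clone names are hypothetical, and the sketch assumes each clone has already been scored for the two reporters with the LexA-Da bait and for interaction with the LexA-Lamin control.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    """Hypothetical record of the reporter readouts for one library clone."""
    name: str
    his3_with_da: bool         # grows on plates lacking histidine with the LexA-Da bait
    lacz_with_da: bool         # expresses LacZ with the LexA-Da bait
    positive_with_lamin: bool  # scores positive with the LexA-Lamin control bait


def specific_da_interactors(clones):
    """Keep clones that are HIS+ and LacZ+ with Da and negative with lamin."""
    return [
        c.name
        for c in clones
        if c.his3_with_da and c.lacz_with_da and not c.positive_with_lamin
    ]


# Made-up readouts, for illustration only.
example = [
    Clone("clone_A", True, True, False),    # specific interactor: kept
    Clone("clone_B", True, True, True),     # also binds lamin: discarded
    Clone("clone_C", True, False, False),   # HIS+ but LacZ-: not scored positive
    Clone("clone_D", False, False, False),  # no interaction
]

print(specific_da_interactors(example))
```

Only clones passing all three criteria would be carried forward to sequence analysis; the lamin counter-screen is what removes non-specific interactors.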

The unique cDNA identified above, VP16-neuroD,
contained an approximately 450 bp insert that spanned the
bHLH region. Sequence analysis showed that the clone
contained an insert encoding a complete bHLH amino acid
sequence motif that was unique and previously unreported.
Further analysis suggested that while the cDNA contained
conserved residues common to all members of the bHLH
protein family, several residues were unique and made it
distinct from previously identified bHLH proteins. The
neuroD cDNA insert was subcloned as a Bam HI-Not I insert
into Bam HI-Not I linearized pBluescript SK+. The resulting
plasmid was designated pSK+1-83.

The neuroD insert contained in the VP16-neuroD plasmid
was used to re-probe a mouse cDNA library prepared from
mouse embryos at developmental stage e10.5. Candidate
clones were isolated and sequenced essentially as described
above. Several clones were isolated. One clone, designated
pKS+m7a RX, was deposited at the American Type Culture
Collection, 12301 Parklawn Drive, Rockville, Md. 20852
USA, on May 6, 1994, under accession number 75768.
Plasmid pKS+m7a RX contains 1646 bp of murine neuroD
cDNA as an EcoRI-XhoI insert. The amino acid sequence
encoded by the insert begins at amino acid residue +73 and
extends to the carboxy-terminus of the NeuroD protein. The
plasmid contains about 855 bp of NeuroD coding sequence
(encoding amino acids 73-536).

None of the mouse cDNAs contained the complete 5'
coding sequence. To obtain the 5' neuroD coding sequence,
a mouse strain 129/Sv genomic DNA library was screened
with the VP16-neuroD plasmid insert (450 bp). Genomic
clones were isolated and sequenced and the sequences were
aligned with the cDNA sequences. Alignment of the
sequence and comparison of the genomic 5' coding
sequences with the Xenopus neuroD clone (Example 8)
confirmed the 5' neuroD coding sequence. The complete
neuroD coding sequence and deduced amino acid sequence
are shown in SEQ ID NOS: 1 and 2.

EXAMPLE 3

NeuroD/neuroD

bHLH proteins share common structural similarities that 
include a basic region that binds DNA and an HLH region 
involved in protein-protein interactions required for
formation of homodimers and heterodimeric complexes. A
comparison of the amino acid sequence of the basic region of
murine NeuroD (amino acids 102 to 113 of SEQ ID NO:2)
with basic regions of other bHLH proteins revealed that
murine NeuroD contained all of the conserved residues
characteristic among this family of proteins. However, in
addition, NeuroD contained several unique residues. These
unique amino acid residues were not found in any other
known HLH, making NeuroD a distinctive new member of
the bHLH family. The NARERNR basic region motif in
NeuroD (amino acids 107-113 of SEQ ID NO:2) is also
found in the Drosophila AS-C protein, a protein thought to
be involved in neurogenesis. Similar, but not identical,
NARERRR and NERERNR motifs (SEQ ID NOS:5 and 6,
respectively) have been found in the Drosophila Atonal and
MASH (mammalian achaete-scute homolog) proteins,
respectively, which are also thought to be involved in
neurogenesis. The NARER motif (SEQ ID NO:7) of neuroD
is shared by other bHLH proteins, and the Drosophila
Daughterless (Da) and Mammalian E proteins. The basic
region of bHLH proteins is important for DNA binding site
recognition, and there is homology between NeuroD and
other neuro-proteins in this functional region. Within the
important dimer-determining HLH region of NeuroD, a low
level of homology was recorded with mouse twist protein
(i.e., 51% homology) and with MASH (i.e., 46%
homology). NeuroD contains several regions of unique
peptide sequence within the bHLH domain including the
junction sequence (MHG).

EXAMPLE 4

NeuroD is expressed in differentiating neurons
during embryonic development

neuroD expression was analyzed during embryonic
development of mouse embryos using in situ hybridization with
an antisense neuroD single-stranded riboprobe labeled with
digoxigenin (Boehringer Mannheim). Briefly, a riboprobe
was prepared from plasmid pSK+1-83 using T7 polymerase
and digoxigenin-11-UTP for labeling. The hybridized probe
was detected using anti-digoxigenin antibody conjugated
with alkaline phosphatase. Color development was carried
out according to the manufacturer's instruction. Stages of
development are commonly expressed as days following
copulation and where formation of the vaginal plug is e0.5.
The results recorded in the in situ hybridization studies were
as follows:

In the e9.5 mouse embryo, neuroD expression was
observed in the developing trigeminal ganglia.

In the e10.5 mouse embryo, a distinctive pattern of
neuroD expression was observed in all the cranial ganglia
(i.e., V-XI) and in dorsal root ganglia (DRG) in the trunk
region of the embryo. At this time neuroD expression was
also observed in the central nervous system in post-mitotic
cells in the brain and spinal cord that were undergoing
neuronal differentiation. In the spinal cord, the ventral
portion of the cord from which the motor neurons arise and
differentiate was observed to express neuroD at high levels;
and expression in the posterior-ventral spinal cord was
higher when compared to more mature anterior-ventral
spinal cord.

In the e11.5 mouse embryo, the ganglionic expression
pattern of neuroD observed in e10.5 persisted. Expression in
the spinal cord was increased over the level of expression
observed in e10.5 embryos, which is consistent with the
emergence of more differentiating neurons at this stage. At this
stage neuroD expression is also observed in other sensory
organs in which neuronal differentiation occurs, for
example, in the nasal epithelium, otic vesicle, and retina of
the eye. In both of these organs neuroD expression was
observed in the region containing differentiating neurons.

In the e14.5 mouse embryo, expression of neuroD was
observed in cranial ganglia and DRG, but expression of
neuroD persisted in the neuronal regions of developing
sensory organs and the central nervous system (CNS). Thus,
neuroD expression was observed to be transient during
neuronal development.

In summary, expression of neuroD in the neurula stage of
the embryo (e10), in the neurogenic derivatives of neural
crest cells, the cranial and dorsal root ganglia, and
post-mitotic cells in the CNS suggests an important possible link
between expression and generation of sensory and motor
nerves. Expression occurring later in embryonic
development in differentiating neurons in the CNS and in sensory
organs (i.e., nasal epithelium and retina) also supports a role
in development of the CNS and sensory nervous tissue.
Since neuroD expression is transient, the results suggest that
neuroD expression is operative as a switch controlling
formation of sensory nervous tissue. It is noteworthy that in
these studies neuroD expression was not observed in
embryonic sympathetic and enteric ganglia (also derived from
migrating neural crest cells). Overall, the results indicate
that neuroD plays an important role in neuronal
differentiation.

EXAMPLE 5

NeuroD is expressed in neural and brain tumor
cells: murine probes identify human neuroD

Given the expression pattern in mouse embryo (Example 4),
Northern blots of tumor cell line mRNAs were examined
using murine neuroD cDNA (Example 2) as a molecular
probe. As a first step, cell lines that have the potential for
developing into neurons were screened. The D283 human
medulloblastoma cell line, which expressed many neuronal
markers, expressed high levels of neuroD by Northern blot
analysis. neuroD was also transcribed at various levels by
different human neuroblastoma cell lines and in certain
rhabdomyosarcoma lines that are capable of converting to
neurons. Murine PC12 pheochromocytoma cells and P19
embryocarcinoma cells differentiate into neurons in tissue
culture in the presence of appropriate inducers, i.e., nerve
growth factor and retinoic acid, respectively. When induced,
murine P19 but not PC12 cells expressed neuroD transcripts.
However, non-induced murine PC12 cells, P19 cells, and
control 3T3 fibroblasts did not produce detectable levels of
neuroD transcripts. Thus, PC12 and P19 cells represent cell
[illegible] potentially useful in screening assays for
identifying inducers of neuroD expression that may
stimulate [illegible] regeneration and differentiation of neural tumor
cells.

EXAMPLE 6

Recombinant cells expressing NeuroD

Recombinant murine 3T3 fibroblast cells expressing
either a myc-tagged murine NeuroD protein or myc-tagged
Xenopus NeuroD protein were made. The recombinant cells
were used as a test system for identifying antibody to
NeuroD described below.

Xenopus NeuroD protein was tagged with the antigenic
marker Myc to allow the determination of the specificity of
anti-NeuroD antibodies to be determined. Plasmid CS2+MT
was used to produce the Myc fusion protein. The CS2+MT
vector (Turner and Weintraub, ibid.) contains the simian
cytomegalovirus IE94 enhancer/promoter (and an SP6
promoter in the 5' untranslated region of the IE94-driven
transcript to allow in vitro RNA synthesis) operatively
linked to a DNA sequence encoding six copies of the Myc
epitope tag (Roth et al., J. Cell Biol. 115: 587-596, 1991;
which is incorporated herein in its entirety), a polylinker for
insertion of coding sequences, and an SV40 late
polyadenylation site. CS2+MT was digested with Xho I to linearize
the plasmid at the polylinker site downstream of the DNA
sequence encoding the myc tag. The linearized plasmid was
blunt-ended using Klenow and dNTPs. A full length
Xenopus cDNA clone was digested with Xho I and Eae I and
blunt-ended using Klenow and dNTPs, and the 1.245 kb
fragment of the Xenopus neuroD cDNA was isolated. The
neuroD fragment and the linearized vector were ligated to
form plasmid CS2+MT x1-83.

CS2+MT was digested with Eco RI to linearize the
plasmid at the polylinker site downstream of the DNA
sequence encoding the myc tag. The linearized plasmid was
blunt-ended using Klenow and dNTPs and digested with
Xho I to obtain a linearized plasmid having an Xho I
adhesive end and a blunt end. Plasmid pKS+m7a containing
a partial murine neuroD cDNA was digested with Xho I, and
the NeuroD containing fragment was blunt-ended and
digested with Xba I to obtain the approximately 1.6 kb
fragment of the murine neuroD cDNA. The neuroD
fragment and the linearized vector were ligated to form plasmid
CS2+MT M1-83(m7a).

Plasmids CS2+MT x1-83 and CS2+MT M1-83(m7a)
were each transformed into murine 3T3 fibroblast cells and
used as a test system for identifying antibody against
NeuroD (Example 7).

EXAMPLE 7

Antibodies to NeuroD

A recombinant fusion protein of maltose binding protein
(MBP) and amino acid residues 70-355 of murine NeuroD
was used as an antigen to evoke antibodies in rabbits.
Specificity of the resultant antisera was confirmed by
immunostaining of the recombinant 3T3 cells described above.
Double-immunostaining of the recombinant cells was
observed with monoclonal antibodies to Myc (i.e., the
control antigenic tag on the transfected DNA) and with
rabbit anti-murine NeuroD in combination with anti-rabbit
IgG. The specificity of the resultant anti-murine NeuroD
sera was investigated further by preparing mouse 3T3
fibroblast cells transfected with different portions of NeuroD
DNA. Specificity seemed to map to the glutamic acid-rich
domain (i.e., amino acids 66-73 of SEQ ID NO:2). The
anti-murine antisera did not react with cells transfected with
the myc-tagged Xenopus neuroD. In a similar manner,
Xenopus NeuroD was used to generate rabbit anti-NeuroD
antisera. The antisera was Xenopus-specific and did not
cross react with cells transfected with myc-tagged murine
neuroD.

EXAMPLE 8

NeuroD is a highly evolutionarily conserved
protein: sequence of Xenopus NeuroD

Approximately one million clones from a stage 17
Xenopus head library made by Kintner and Melton (Development
99: 311, 1987) were screened with the mouse cDNA insert
as a probe at low stringency. The hybridization was
performed with 50% formamide/4x SSC at 33° C. and washed
with 2x SSC/0.1% SDS at 40° C.

Positive clones were identified and sequenced. Analysis
of the Xenopus neuroD cDNA sequence (SEQ ID NO:3)
revealed that NeuroD is a highly conserved protein between
frog and mouse. The deduced amino acid sequences of frog
and mouse (SEQ ID NOS:2 and 4) show 96% identity in the
bHLH domain (50 of 52 amino acids are identical) and 80%
identity in the region that is carboxy-terminal to the bHLH
domain (159 of 198 amino acids are identical). The domain
structures of murine and Xenopus NeuroD are highly
homologous with an "acidic" N-terminal domain (i.e.,
glutamic or aspartic acid rich); a basic region; helix 1, loop,
helix 2; and a proline rich C-terminal region. Although the
amino terminal regions of murine and Xenopus NeuroD
differ in amino acid sequence, both retain a glutamic or
aspartic acid rich "acidic domain" (amino acids 102 to 113
of SEQ ID NO:2 and amino acids 56 to 79 of SEQ ID NO:4).
It is highly likely that the acidic domain constitutes an
"activation" domain for the NeuroD protein, in a manner
analogous to the activation mechanisms currently
understood for other known transcription regulatory factors.

EXAMPLE 9

Neuronal expression of Xenopus neuroD

The expression pattern of neuroD in whole mount
Xenopus embryos was determined using in situ hybridization with
a single stranded digoxigenin-labeled Xenopus neuroD
antisense cDNA riboprobe. Embryos were examined at several
different stages.

Consistent with the mouse expression pattern, by late
stage, all cranial ganglia showed very strong staining
patterns. In Xenopus, as in other vertebrate organisms, neural
crest cells give rise to skeletal components of the head, all
ganglia of the peripheral nervous system, and pigment cells.
Among these derivatives, the cranial sensory ganglia, which
are of mixed crest and placode origin, represent the only
group of cells that express neuroD. High levels of neuroD
expression in the eye were also observed, correlating with
active neuronal differentiation in the retina at this stage.
Expression is observed in the developing olfactory placodes
and otic vesicles, as was seen in mice. The pineal gland also
expressed neuroD. All of this expression is transient,
suggesting that neuroD functions during the differentiation
process but is not required for maintenance of these
differentiated cell types.

As early as stage 14 (i.e., the mid-neurula stage) neuroD
expression was observed in the cranial neural crest region
where trigeminal ganglia differentiate. Primary
mechanosensory neurons in the spinal cord, also referred to as
Rohon-Beard cells and primary motor neurons, showed
neuroD expression at this stage.

By stage 24, all of the developing cranial ganglia,
trigeminal, facio-acoustic, glosso-pharyngeal, and vagal
nervous tissues showed a high level of neuroD expression.
High levels of expression of neuroD was also observed in
the eye at this stage. (Note that in Xenopus neuronal
differentiation in the retina occurs at a much earlier stage
than in mice, and neuroD expression was correspondingly
earlier and stronger in this animal model.)

In summary, in Xenopus as in mouse, neuroD expression
was correlated with sites of neuronal differentiation. The
remarkable evolutionary conservation of the pattern of neu-
roD expression in differentiating neurons supports the notion
that NeuroD has been evolutionarily conserved both
structurally and functionally in these distant classes, which
underscores the critical role performed by this protein in
embryonic development.

EXAMPLE 10

Ectopic expression of neuroD converts non-
neuronal cells into neurons

To further analyze the biological functions of NeuroD, a
gain-of-function assay was conducted. In this assay, RNA
was microinjected into one of the two cells in a 2-cell stage
Xenopus embryo, and the effects on later development of
neuronal phenotype was evaluated. For these experiments
myc-tagged neuroD transcripts were synthesized in vitro
using SP6 RNA polymerase. The myc tagged-neuroD
transcripts were microinjected into one of the two cells in a
Xenopus 2-cell embryo, and the other cell of the embryo
served as an internal control. Antibodies to Xenopus
N-CAM, a neural adhesion molecule, anti-Myc (to detect the
exogenous protein), and immunostaining techniques were
used to evaluate phenotypic expression of the neuronal
marker (and control) gene during the subsequent
developmental stages of the microinjected embryos. Remarkably, an
evaluation of over 130 embryos that were injected with
neuroD RNA showed a striking increase in ectopic
expression of N-CAM on the microinjected side of the embryo
(i.e., Myc+), as judged by increased immunostaining. The
increased staining was observed in the region from which
neural crest cells normally migrate. It is considered likely
that ectopic expression (or over-expression) of neuroD
caused neural crest stem cells to follow a neurogenic cell
fate. Outside the neural tube, the ectopic immunostaining
was observed in the faciocranial region and epidermal
layer, and in some cases the stained cells were in the ventral
region of the embryo far from the neural tube. The
immunostained cells not only expressed N-CAM ectopically, but
displayed a morphological phenotype of neuronal cells. At
high magnification, the N-CAM expressing cells exhibited
typical neuronal processes reminiscent of axonal processes.

To confirm that the ectopic N-CAM expression resulted
from a direct effect on the presumptive epidermal cells and
not from aberrant neural cell migration into the lateral and
ventral epidermis, neuroD RNA was injected into the top tier
of 32-cell stage embryos, in order to target the injection into
cells destined to become epidermis. N-CAM staining was
observed in the lateral and ventral epidermis without any
noticeable effect on the endogenous nervous system,
indicating that the staining of N-CAM in the epidermis
represents the conversion of epidermal cell fate into neuronal cell
fate.

Ectopic generation of neurons by neuroD was confirmed
with other neural specific markers, such as neural-specific
class II β-tubulin (Richter et al., Proc. Natl. Acad. Sci. USA
85: 8066, 1988), acetylated alpha-tubulin (Piperno and
Fuller, J. Cell Biol. 101: 2085, 1985), tanabin
(Hemmati-Brivanlou et al., Neuron 9: 417, 1992), neurofilament
(NF)-M (Szaro et al., J. Comp. Neurol. 273: 344, 1988), and
Xen-1,2 (Ruiz i Altaba, Development 115: 67, 1992). The
embryos were subjected to immunochemistry as described
by Turner and Weintraub (Genes Dev. 8:1434, 1994, which
is incorporated by reference herein) using primary
antibodies detected with alkaline phosphatase-conjugated goat
anti-mouse or anti-rabbit antibodies diluted to 1:2000
(Boehringer-Mannheim). Anti-acetylated alpha-tubulin was
diluted 1:2000. Anti-Xen-1 was diluted 1:1. Anti-NF-M was
diluted 1:2000. Embryos stained for NF-M were fixed in
Dent's fixative (20% dimethylsulfoxide/80% methanol) and
cleared in 2:1 benzyl benzoate/benzyl alcohol as described
by Dent et al. (Development 105: 61, 1989, which is
incorporated by reference herein). In situ hybridization of
embryos was carried out essentially as described by Harland
(in Methods in Cell Biology, B. K. Kay, H. J. Peng, Eds.,
Academic Press, New York, N.Y., Vol. 36, pp. 675-685,
1991, which is incorporated by reference herein) as modified
by Turner and Weintraub (ibid.). In situ hybridization with
β-tubulin without RNase treatment can also detect tubulin
expression in the ciliated epidermal cells. All of these
markers displayed ectopic staining on the neuroD RNA injected
side. Injection of neuroD mRNA into vegetal cells led to no
ectopic expression of neural markers except in one embryo
that showed internal N-CAM staining in the trunk region,
suggesting the absence of cofactors or the presence of
inhibitors in vegetal cells. However, the one embryo that
showed ectopic neurons in the internal organ tissue suggests
that it may be possible to convert non-ectodermal lineage
cells into neurons under certain conditions.

The embryos were also stained with markers that detect
Rohon-Beard cells (cells in which neuroD is normally
expressed). Immunostaining using the method described
above for Rohon-Beard cell-specific markers such as
HNK-1 (Nordlander, Dev. Brain Res. 50: 147, 1989, which
is incorporated by reference herein) at a dilution of 1:1,
Islet-1 (Ericson et al., Science 256: 1555, 1992 and Korzh et
al., Development 118: 417, 1993) at a dilution of 1:500, and
in situ hybridization as described above with shaker-1
(Ribera et al., J. Neurosci. 13: 4988, 1993) showed more
cells staining on the injected side of the embryos.

The combined results support the notion that ectopic
expression of NeuroD induced differentiation of neuronal
cells from cells that, without neuroD microinjection, would
have given rise to non-neuronal cells. In summary, these
experiments support the notion that ectopic neuroD
expression can be used to convert a non-neuronal cell (i.e.,
uncommitted neural crest cells and epidermal epithelial
basal stem cells) into a neuron. These findings offer for the
first time the potential for gene therapy to induce neuron
formation in injured neural tissues.

Interesting morphological abnormalities were observed in
the microinjected embryos. In many cases the eye on the
microinjected side of the embryo failed to develop. In other
embryos, the spinal cord on the microinjected side of the
embryo failed to develop properly, and the tissues were
strongly immunopositive when stained with anti-N-CAM. In
addition, at the mid-neurula stage many microinjected
embryos exhibited an increase in cell mass in the cranial
region of the embryo from which (in a normal embryo) the
neural crest cells and their derivatives (i.e., cranial
ganglionic cells) would migrate. The observed cranial bulge
exhibited strong immunostaining with antibodies specific for
N-CAM. These results were interpreted to mean that
morphological changes in the eye, neural crest, and spinal cord
resulted from premature neural differentiation which altered
the migration of neural and neural crest precursor cells.

NeuroD-injected embryos were also assayed for alteration
in the expression of Xtwist, the Xenopus homolog of
Drosophila twist, to determine whether neuroD converted
non-neuronal components of neural crest cells into the neural
lineage. In wild-type embryos, Xtwist is strongly expressed
in the non-neuronal population cephalic neural crest cells
that give rise to the connective tissue and skeleton of the
head. neuroD-injected embryos were completely missing
Xtwist expression in the migrating cranial neural crest cells
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on the injected side. The failure to gCDcrate sufficient cranial mouse neuroD cDNA. Host E coli strain LE392 (New 

mesenchynaal neural crest precursors in neuroD-injected England Biolabs) were grown in LB+10 mM MgSO^O.2% 

embryos was also observed morphologically, since many of mallose overniglit at 37° C, Tlie cells wcje harvested ajid 

tlie injected embryos exhibited poor branchial arch devel- rcsuspended in 10 mM MgS04 to a final OD600 of 2. The 

opmentin the head. Furthermore, the increased mass of cells 5 resuspended cells were used as hosts for phage infection, 

in the cephalic region stained very strongly for N-CAM, The optimal volume of phage stock for use in this saeening 

p-tubulin. and Xen-1, indicating that Uiese cells were neural was determined by using serial dilutions of the phage stock 

in character, of a human fibroblast genomic library in lambda FIX 13 

The converse experiment in which frog embryos were (Stratagcne) to infect LE392 ceUs (New England Biolabs). 

injected with Xtwist mRNA showed that ectopic expression To obtain approxijnately 50,000 plaques per plate, a 2.5 pi 

of Xtwist significantly decreased neuroD expression on the aliquot of the phage stock was used to infect 600 jil of the 

injected sidc^ Thus, two members of the bllLH family, rcsuspended L£392 ceUs. The ceUs were incubated with the 

neuroD and Xtwist, may compete f or definmg the identit)' of tiage for 15 minutes at 37" C after which the ceUs were 

different ceU types denved from the neural aest. In the j^ixed %vith 6.5 ml of top agar warmed to 50° C The top agar 

neuroD-injectcd embryos, exogenous ncuroD may induce „t„»^^ ^ ..i-^i-n a - ^ w . a • .ttoz-T a 

pxemigratory neural crwt to diifocntialc into neurons In situ, »5 '^J^^ff^ °" "''^""ated ovenught at 37° C. A 

and consequenUy they fail to luigrate to their noimal posi- ""^ 15-cm pialcs were prepared in Uus manner, 

lions. Duplicate plaque lifts were prepared. A first set of Hybond 

The etfect of introduction of exogenous neuroD on the niembraDcs (Amersham) were placed onto the plates aad 

fate of ccUs that normally express neuroD, such as cranial allowed to sit for 2 minutes. The initial membranes were 

ganglia, eye, otic vesicle, olfactory organs, and primary removed and the duplicate membranes were laid on the 

neurons, and on other CNS cells that normally do not plates for 4 minutes. The membranes were allowed to air 

express ncuroD, was determined by staining for differentia- tbcn the phage were denatured in 0.5M NaOU, 1.5M 

tion markers. When the cranial region of the embryo is NaCl for 7 minutes. The membranes were neutralized with 

severely affected by ectopic ncuioD, tlie injected side of the two washes in ncutraJiz^ation buffer (L5M NaCl, 0.5M Tris, 

embryos displayed cither small or no eyes in addition to 25 pH 7.2). Alter neutralization, the membranes were 

poorly organized brains, otic vesicles, and olfactory organs. crosslinked by exposure to MW. A 1 kb Eco RI-Hind ni 

Moreover, as the embryos grew, tlie spinal cord showed fragment containing murine ncuroD coding sequences was 

retarded growth, remaining thinner and shorter on the random primed using the Random Priming Kit (Bochringer 

neuroD-injected side. Mannheim) according to the manufacturer* s instructions. 

N-CAM staining in the normal embryo at early stages was 30 Membranes were prepared for hybridization by placing six 

not uniforru throughout Die entire neural plate, but raUier membranes in 10 ml of FBI hybridization bu£fer flOO g 

was more prommentm the medial region of the neural plate. polyethylene glycol 800, 350 ml 20% SDS, 75 ml 20x 

Injected embryos analyzed for N-CAM expression show that SSPE; add water to a final volume of one liter] and incu- 

the neural plate on the injected side of the early stage bating tlie membranes at 65^ C. for 10 minutes. After 10 

embryos was stained more intensely and more lateraUy. The 35 minutes, denatured sahnon sperm UNA was added to a final 

increase m N^CAM staining was not associated v.dth any conrcntration of J 0 pg/mJ and denatured probe was added to 

lateral expansion of the neural plate as assayed by visual ^ ^^1 concentration of 0.25-0.5x10'^ cpm/ml. The mera- 

mspection and staining with the epidermal marker EpA. This branes were hybridized at 65° C. for a period of 8 hours to 

was in contrast to what has been observed with XASH-3 overnight. After incubation, the SDS for 30 minutes at 50" 

injertion that causes neural plate expansion. These obser- ^ c. The first wash was foUowcd by a final wash in 0.1 x SSC, 

vations suggest that the first effects of ncuroD are to cause o.l% SDS for 30 minutes at 55** C AutOradiographs of the 

neuronal precursors in the neural plate to differentiate pre- membranes were prepared. The hrst screen identified 55 

maturely. putative positive plaques. Thirty-one of the plaques were 

To determine whether neuroD caused neuronal precursors subjected to a secondary screen using the method essentially 

to differentiate prematurely, injected embryos were stained 45 set forth above. Ten positive clones were identified and 

using two neuronal markers that are expressed in differen- subjected to a tertiary screen as described above. Eight 

tiated neurons, neural specific p-tubulin and tanabin. In situ positive clones were identified after the tertiary screen. Of 

hybridization for P-mbuUn and tanabin was carried out as these eight clones, three (14B1, 9F1 and 20A1) were chosen 

described above. Ovcr-e:spression of neuroD dramatically for further analysis. Qones 14B1 and 20A1 were deposited 

increased the [^tubulin signals in the region of the neural 5^ at the American Type Culture CoUcction, 1: 2301 Parklawn 

plate containing both motor neurons and Rohon-Beard ceUs Drive, Rock\'ilJe, Md. 20852 USA, on Nov. 1, 1995, under 

at stage 14. The eaiHest ectopic p-tubuUn positive cells on accession numbers 69943 and 69942. respectively, 

the injected side were observed at the end of gastrulaUon phagc DNA was prepared from clones i4Bl, 9F1, and 

when the control side did not yet show any [Vtubuiin 20A3.Thc J4B1 and 20A1 phage DNA were digested with 

positive ceUs. Tanabin was also expressed in more cells in 55 p^t i to isolate the 1 .2 kb and 1.5 kb fragments, respectively, 

the spinal cord m the neuroD injected side of the embryos at that hybridized to the mouse neuroD probe. The 9F1 phage 

stage 14. These results suggest that neuroD can cause dNA was digested with Eco RI and SacI to obtain an 

premature differentiation of tlie neural precursors into dif~ approximately 2.2 kb fragment that hybridizes with the 

lerentiatcd neurons. This is a powerful indication that, when niouse neuroD probe. The fragments were each subcloned 

ectopically expressed or over-expressed. NeuroD can dif-^ into plasmid BLUESCRIPr SK (Slratagene) that liad been 

ferenUate imtotoc ceUs into non-dividing mature neurons. linearized with the appropriate restriction enzyme(s). Ihe 

EXAMPLE 11 fragments were sequenced using Scqucnasc Version 2.0 (US 

„ . , - ^ . Biochemical) and the following primers: the universal 

Human genomic clones of neuroD. neuroD2 and p^^^r M13-21, the T7 primer, and the T3 primer. Sequence 

"^^^^•^ 65 analysis of clones 9F1 (SEQ ID N0S:8 and 9), and 14B 1 

Genomic clones encoding human NeuroD were obtained (SEQ ID NOSilO and U ) showed a high similarity between 

by probing a human fibroblast genomic library with the the mouse and human coding sequences at both the amino 






acid and nucleotide level. In addition, while clones 9F1 and
14B1 shared 100% identity in the HLH region at the amino
acid level (i.e., residues 117-156 in SEQ ID NO:9 and
residues 137-176 in SEQ ID NO:11), they diverged in the
amino-terminal of the bHLH. This finding strongly suggests
that 14B1 is a member of the neuroD family of genes.
Sequence analysis demonstrates that clone 9F1 has a high
degree of homology throughout the sequence region that
spans the translation start site to the end of the bHLH region.
The 9F1 clone has 100% identity to mouse NeuroD in the
HLH region (i.e., residues 117-156 in SEQ ID NO:9 and
residues 117-156 in SEQ ID NO:2), and an overall identity
of 94%. The 14B1 clone also has 100% identity to the HLH
region (i.e., residues 137-176 in SEQ ID NO:11 and
residues 117-156 in SEQ ID NO:2), but only 40% identity to
9F1 and 39% identity to mouse NeuroD in the amino-terminal
region. This demonstrates that 9F1 is the human
homolog of mouse neuroD, whereas the strong conservation
of the NeuroD HLH identifies 14B1 as another member of
the neuroD HLH subfamily. Human clone 9F1 (represented
by SEQ ID NOS:8 and 9) is referred to as human neuroD.
Human clone 14B1 is referred to as neuroD2 (SEQ ID
NOS:10 and 11), and human clone 20A1 is referred to as
neuroD3 (SEQ ID NOS:12 and 13).
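
The percent-identity figures quoted above compare residue ranges of aligned protein sequences (each of the HLH ranges cited spans 40 residues). The following Python sketch shows one common way such a figure can be computed. It is illustrative only: the function names `percent_identity` and `region` are hypothetical helpers, the two peptides are made-up toy strings rather than the sequences of SEQ ID NOS:2, 9 or 11, and the text does not state which alignment method was actually used.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned, non-gap positions at which two sequences match.

    Both inputs are assumed to be pre-aligned (equal length, '-' for gaps).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def region(seq: str, start: int, end: int) -> str:
    """1-based, inclusive slice, matching residue numbering such as 117-156."""
    return seq[start - 1:end]


# Hypothetical toy peptides (NOT the patent sequences).
toy_a = "MKTAYLLRNE"
toy_b = "MKSAYLLRQE"

print(percent_identity(toy_a, toy_b))                              # 80.0
print(percent_identity(region(toy_a, 4, 8), region(toy_b, 4, 8)))  # 100.0
```

In the toy data the overall identity is lower than the identity of the conserved core, which mirrors the pattern reported for 14B1: a fully conserved HLH region alongside a divergent amino-terminal region.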

An 800 bp Hind III-Eag I fragment from the neuroD2
sequences from clone 14B1 was random primed with 32P.
This probe was used to screen a 16-day mouse embryo
cDNA library essentially as described previously. Filters
were prehybridized in FBI hybridization buffer (see above)
at 50° C. for 10 minutes. After prehybridization, denatured
salmon sperm DNA was added to a final concentration of 10
μg/ml; denatured probe was added to a final concentration of
one million cpm/ml. The filter was hybridized at 50° C.
overnight. After incubation, excess probe was removed, and
the filter was washed first in 2x SSC, 0.1% SDS for 30
minutes at 60° C. One clone, designated 1.1.1, contained
1.46 kb of murine neuroD2 cDNA as an Eco RI-Hind III
insert. The nucleotide sequence and deduced amino acid
sequences are shown in SEQ ID NOS:16 and 17, respectively.
A comparison between the human genomic sequence
and the mouse cDNA sequence demonstrates that there were
no introns in the human neuroD2 coding region.
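
The FBI hybridization buffer recipe given earlier in this example (100 g polyethylene glycol, 350 ml 20% SDS, 75 ml 20x SSPE, water to one liter) can be converted to final concentrations with the usual C1·V1 = C2·V2 dilution relation. The Python sketch below is a worked illustration of that arithmetic only; the helper name `final_concentration` is hypothetical, and only the quantities come from the text.

```python
def final_concentration(stock_conc: float, stock_volume_ml: float,
                        final_volume_ml: float) -> float:
    """Solve C1*V1 = C2*V2 for C2; the result has the units of the stock."""
    if final_volume_ml <= 0:
        raise ValueError("final volume must be positive")
    return stock_conc * stock_volume_ml / final_volume_ml


FINAL_VOLUME_ML = 1000.0  # "add water to a final volume of one liter"

fbi_buffer = {
    # 100 g of PEG per liter expressed as grams per 100 ml (% w/v)
    "PEG (% w/v)": 100.0 * 100.0 / FINAL_VOLUME_ML,
    # 350 ml of a 20% SDS stock
    "SDS (%)": final_concentration(20.0, 350.0, FINAL_VOLUME_ML),
    # 75 ml of a 20x SSPE stock
    "SSPE (x)": final_concentration(20.0, 75.0, FINAL_VOLUME_ML),
}

for component, value in fbi_buffer.items():
    print(f"{component}: {value:g}")
```

Under these numbers the buffer works out to roughly 10% PEG, 7% SDS and 1.5x SSPE.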

In a similar manner, a random-primed 1.1 kb Pst I
fragment from the human neuroD3 cDNA present in the
20A1 clone is prepared. The probe is used to screen a mouse
embryo and newborn mouse brain libraries. Hybridization
and wash conditions are as described above. Positive clones
are analyzed by restriction and sequence analysis, and a full
length clone is obtained. The mouse neuroD3 cDNA is used
to prepare a probe for Northern analysis to study expression
patterns in embryonic through adult mice.

Using a random-primed antisense probe to the mouse
neuroD2 (Boehringer Mannheim) the expression pattern
was determined using Northern analysis. Filters containing
murine RNA from the brain and spinal cords of embryonic
through adult mice were probed at high stringency and
washed in 0.1x SSC, 0.1% SDS at 65° C. Northern analysis
showed neuroD2 expression in the brain and spinal cords of
mice from embryonic day 12.5 through adult.

EXAMPLE 12 

Chromosome mapping of human neuroD clones 

FISH karyotyping was performed on fixed metaphase
spreads of the microcell hybrids essentially as described
(Trask et al., Am. J. Hum. Genet. 48: 1-15, 1991; and
Brandriff et al., Genomics 10: 75-82, 1991; which are
incorporated by reference herein in their entirety). neuroD
sequences were detected using the 9F1 or 20A1 phage DNA
as probes labeled using digoxigenin dUTP (Boehringer
Mannheim) according to the manufacturer's instructions.
Phage DNA was biotinylated by random priming (Gibco/
BRL BioNick Kit) and hybridized in situ to denatured
metaphase chromosome spreads for 24-48 hours. Probes
were detected with rhodamine-conjugated antibodies to
digoxigenin, and chromosomes were counterstained with
DAPI (Sigma). Signals were viewed through a fluorescence
microscope and photographs were taken with color slide
film. FISH analysis indicated clone 9F1 maps to human
chromosome 2q, and clone 20A1 maps to human
chromosome 5.

Chromosome mapping was also carried out on a human/
rodent somatic cell hybrid panel (National Institute of General
Medical Sciences, Camden, N.J.). This panel consists of
DNA isolated from 24 human/rodent somatic cell hybrids
retaining one human chromosome. For one set of
experiments, the panel of DNAs were digested with Eco RI
and electrophoresed on an agarose gel. The DNA was
transferred to Hybond-N membranes (Amersham). A random
primed (Boehringer Mannheim) 4 kb Eco RI-Sac I
fragment of clone 9F1 was prepared. The filter was
prehybridized in 10 ml of FBI hybridization buffer (see above) at
65° C. for 10 minutes. After prehybridization, denatured
salmon sperm DNA was added to a final concentration of 10
μg/ml; denatured probe was added to a final concentration of
one million cpm/ml. The filter was hybridized at 65° C. for
a period of 8 hours to overnight. After incubation, excess
probe was removed, and the filter was washed first in 2x
SSC, 0.1% SDS for 30 minutes at 65° C. The first wash was
followed by a final wash in 0.1x SSC, 0.1% SDS for 30
minutes at 65° C. An autoradiograph of the filter was
prepared. Autoradiographs confirmed the FISH mapping
results.

In the second experiment, the panel was digested with Pst I,
electrophoresed and transferred essentially as described
above. A random-primed (Boehringer Mannheim) 1.6 kb Pst
I fragment of clone 20A1 was prepared. The membrane was
prehybridized, hybridized with the 20A1 probe and washed
as described above. Autoradiographs of the Southern
showed that 20A1 mapped to human chromosome 5 and
confirmed the FISH mapping results. After autoradiography,
the 20A1-probed membrane was stripped by a wash in 0.5M
NaOH, 1.5M NaCl. The membrane was neutralized in 0.5M
Tris-HCl (pH 7.4), 1.5M NaCl. The filter was washed in
0.1x SSC before prehybridization. A random-primed
(Boehringer Mannheim) 1.2 kb Pst I fragment of clone 14B1
was prepared. The washed membrane was prehybridized
and hybridized with the 14B1 probe as described above.
After washing under the previously described conditions, the
membrane was autoradiographed. Autoradiographs
demonstrated that clone 14B1 mapped to chromosome 17.
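
The logic of mapping with a monochromosomal hybrid panel, as used above, is that a human-specific band should appear only in the lane(s) whose hybrid retains the chromosome carrying the gene. The Python sketch below illustrates that assignment rule. It is a simplified illustration under stated assumptions: the lane names (`hybrid_1`, `hybrid_2`, ...) and the function `assign_chromosome` are hypothetical, and the example "positive lanes" are invented inputs chosen to be consistent with the chromosome assignments reported in the text, not recorded autoradiograph data.

```python
from typing import Dict, Optional, Set

# A 24-member panel in which each hybrid retains one human chromosome
# (1-22, X, Y). Lane names are hypothetical labels.
CHROMOSOMES = [str(n) for n in range(1, 23)] + ["X", "Y"]
PANEL: Dict[str, str] = {f"hybrid_{c}": c for c in CHROMOSOMES}


def assign_chromosome(panel: Dict[str, str],
                      positive_lanes: Set[str]) -> Optional[str]:
    """Return the chromosome whose hybrids are exactly the positive lanes.

    Returns None when the pattern is empty or points to more than one
    chromosome, i.e. when no unambiguous assignment can be made.
    """
    unknown = positive_lanes - set(panel)
    if unknown:
        raise ValueError(f"lanes not in panel: {sorted(unknown)}")
    hits = {panel[lane] for lane in positive_lanes}
    if len(hits) != 1:
        return None
    chromosome = next(iter(hits))
    expected = {lane for lane, c in panel.items() if c == chromosome}
    return chromosome if expected == positive_lanes else None


# Invented lane patterns consistent with the assignments in the text.
print(assign_chromosome(PANEL, {"hybrid_2"}))   # 2  (as reported for 9F1)
print(assign_chromosome(PANEL, {"hybrid_5"}))   # 5  (as reported for 20A1)
print(assign_chromosome(PANEL, {"hybrid_17"}))  # 17 (as reported for 14B1)
```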

EXAMPLE 13 

Human neuroD complementary DNA

To obtain a human neuroD cDNA, one million plaque
forming units (pfu) were plated onto twenty LB+10 mM
MgSO4 (150 mm) plates using the Stratagene human cDNA
library in Lambda ZAP II in the bacterial strain XL-1 Blue
(Stratagene). Plating and membrane lifts were performed
using standard methods, as described in Example 11. After
UV cross-linking, the membranes were pre-hybridized in an
aqueous hybridization solution (1% bovine serum albumin,
1 mM EDTA, 0.5M Na2HPO4 (pH 7.4), 7% SDS) at 50° C.
for two hours.

The mouse neuroD cDNA insert was prepared by digesting
the pKS+m7a RX plasmid with Eco RI and Xho I, and
isolating the fragment containing the cDNA by electroelution.
A probe was made with the cDNA containing fragment
by random primed synthesis with random hexanucleotides,
dGTP, dATP, dTTP, alpha-32P-labeled dCTP, and Klenow in
a buffered solution (25 mM Tris (pH 6.9), 50 mM KCl, 5 mM
MgCl2, 1 mM DTT). The probe was purified from the
unincorporated nucleotides on a G-50 sepharose column.
The purified probe was heat denatured at 90° C. for 3
minutes.

After prehybridization, the denatured probe was added to
the membranes in hybridization solution. The membranes
were hybridized for 24 hours at 50° C. Excess probe was
removed from the membranes, and the membranes were
washed in 0.1x SSC, 0.1% SDS for 20 minutes at 50° C. The
wash solution was changed five times. The membranes were
blotted dry and covered with plastic film before being
subjected to autoradiography. Autoradiography of the filters
identified 68 positive clones. The clones are plaque-purified
and rescreened to obtain 40 pure, positive clones. The
positive clones were screened with a random-primed Pst I
fragment from clone 9F1 (human neuroD). Twelve positive
clones that hybridized with the human neuroD genomic
probe were isolated.
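
The plating density and the successive screening counts in this example can be summarized with simple arithmetic. The Python sketch below is illustrative only; the helper name `pfu_per_plate` and the step labels are hypothetical, while the numbers (one million pfu, twenty plates, and 68, 40 and 12 clones) are taken from the text above.

```python
def pfu_per_plate(total_pfu: int, plates: int) -> float:
    """Average number of plaque forming units plated per plate."""
    if plates <= 0:
        raise ValueError("number of plates must be positive")
    return total_pfu / plates


TOTAL_PFU = 1_000_000
PLATES = 20

# Clone counts reported after each screening step.
SCREEN_STEPS = [
    ("positives in the primary screen", 68),
    ("pure positives after rescreening", 40),
    ("positives with the human 9F1 probe", 12),
]

print(f"pfu per plate: {pfu_per_plate(TOTAL_PFU, PLATES):,.0f}")
for label, count in SCREEN_STEPS:
    # Fraction of all plated phage that survive to this step
    print(f"{label}: {count} ({count / TOTAL_PFU:.1e} of plated pfu)")
```

The per-plate density of 50,000 pfu matches the plaque density targeted for the genomic library screen in Example 11.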

The plasmid vector containing cDNA insert was excised
in vivo from the lambda phage clone according to the
Stratagene methodology. Briefly, eluted phage and XL-1
Blue cells (200 microliters of OD 600=1) were mixed with
R408 helper phage provided by Stratagene for 15 minutes at
37° C. Five milliliters of rich bacterial growth media (2x
YT, see Sambrook et al., ibid.) was added, and the cultures
were incubated for 3 hours at 37° C. The tubes were heated
at 70° C. for 20 minutes and spun for 5 minutes at 4,000xg.
After centrifugation, 200 microliters of supernatant was
added to the same volume of XL-1 Blue cells (OD=1), and
the mixture was incubated for 15 minutes at 37° C., after
which the bacterial cells were plated onto LB plates
containing 50 μg/ml ampicillin. Each colony was picked and
grown for sequencing template preparation. The clones were
sequenced and compared to the human genomic sequence. A
full length cDNA encoding human neuroD that was identical
to the 9F1 neuroD genomic sequence was obtained and
designated HC2A. The nucleotide and deduced amino acid
sequences are shown in SEQ ID NOS:14 and 15, respectively.
Clone HC2A was deposited at the American Type
Culture Collection, 12301 Parklawn Drive, Rockville, Md.
20852 USA, on Nov. 1, 1995, under accession number
69944.

EXAMPLE 14 

Construction of knock-out mice 

Knock-out mice in which the murine neuroD coding
sequence was replaced with the β-galactosidase gene and the
neomycin resistance gene (neo) were generated i) to assess
the consequences of eliminating the murine NeuroD protein
during mouse development and ii) to permit examination of
the expression pattern of neuroD in embryonic mice.
Genomic neuroD sequences used for these knock-out mice
were obtained from the 129/Sv mice so that the homologous
recombination could take place in a congenic background in
129/Sv mouse embryonic stem cells. Several murine neuroD
genomic clones were isolated from a genomic library
prepared from 129/Sv mice (Zhuang et al., Cell 79: 875-884,
1994; which is incorporated herein by reference in its
entirety) using the Bam HI-Not I neuroD cDNA containing
fragment of pSK+1-83 (Example 2) as a random-primed
probe essentially as described in Example 11. Plasmid pPNT
(Tybulewicz et al., Cell 65: 1153-1163, 1991; which is
incorporated herein by reference in its entirety) containing
the neomycin resistance gene (neo; a positive selection
marker) and the Herpes simplex virus thymidine kinase gene
(hsv-tk; a negative selection marker) under the control of the
PGK promoter provided the vector backbone for the targeting
construct. A 1.4 kb 5' murine neuroD genomic fragment
together with the 3 kb cytoplasmic β-galactosidase gene
were inserted between the Eco RI and Xba I sites of the
pPNT vector, and an 8 kb fragment containing the genomic
3' untranslated sequence of neuroD was inserted into the
vector backbone between the Xho I and Not I sites.
To prepare an Eco RI-Xba I fragment containing neuroD
promoter sequences joined to the β-galactosidase gene, a 1.4
kb Eco RI (vector-derived)-Asp 718 fragment containing the
5' untranslated murine neuroD genomic sequence was
ligated to a Hind III-Xba I fragment containing the
cytoplasmic β-galactosidase gene such that the Asp 718 and
Hind III sites were destroyed. The resulting approximately
4.4 kb Eco RI-Xba I fragment, containing the 5' neuroD
genomic sequence (including the neuroD promoter) and the
β-galactosidase gene in the same transcriptional orientation,
was inserted into Eco RI-Xba I linearized pPNT to yield the
plasmid pPNT/5'+β-gal. A neuroD fragment containing 3'
untranslated DNA was obtained from a murine neuroD
genomic clone that had been digested with Spe I and Not
I (vector-derived) to yield an 8 kb fragment. To obtain a 5'
Xho I site, the 8 kb fragment was inserted into Spe I-Not I
linearized pBluescriptSK+ (Stratagene), and the resulting
plasmid digested with Xho I and Not I to obtain the 8 kb
neuroD 3' genomic fragment. The Xho I-Not I fragment was
inserted into Xho I-Not I linearized pPNT/5'+β-gal to yield
the neuroD targeting vector. The final construct contained
the 5' neuroD fragment, the β-galactosidase gene, and the 3'
genomic neuroD fragment in the same orientation, and the
hsv-tk and neomycin resistance genes in the opposite
orientation.
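
The composition of the targeting vector described above can be tabulated to check the stated fragment sizes against one another. The Python sketch below is an illustrative bookkeeping aid only: the `Element` class and `combined_size_bp` function are hypothetical, the list is an inventory rather than a physical map (the text does not give the position of neo and hsv-tk relative to the neuroD arms, so no order is implied), and sizes not stated in the text are left as `None`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Element:
    name: str
    size_bp: Optional[int]  # None where the text gives no size
    orientation: str        # "same" or "opposite", relative to the neuroD arms


# Inventory of elements named in the text (not a physical map).
CONSTRUCT: List[Element] = [
    Element("5' neuroD genomic fragment", 1400, "same"),
    Element("cytoplasmic beta-galactosidase gene", 3000, "same"),
    Element("3' neuroD genomic fragment", 8000, "same"),
    Element("neomycin resistance gene (neo)", None, "opposite"),
    Element("HSV thymidine kinase gene (hsv-tk)", None, "opposite"),
]


def combined_size_bp(elements: List[Element]) -> int:
    """Sum the sizes of the elements whose size is stated."""
    return sum(e.size_bp for e in elements if e.size_bp is not None)


# The 5' arm plus beta-gal should account for the ~4.4 kb Eco RI-Xba I fragment.
five_prime_cassette = CONSTRUCT[:2]
print(combined_size_bp(five_prime_cassette))  # 4400

for element in CONSTRUCT:
    size = f"{element.size_bp} bp" if element.size_bp is not None else "size not stated"
    print(f"{element.name}: {size}, {element.orientation} orientation")
```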

The targeting construct was transfected by electroporation
into mouse embryonic stem (ES) cells. A 129/Sv derived ES
cell line, AK-7, described by Zhuang et al. (ibid.) was used
for electroporation. These ES cells were routinely cultured
on mitomycin C-treated (Sigma) SNL 76/7 cells (feeder
cells) as described by McMahon and Bradley (Cell 62:
1073-1085, 1990; which is incorporated herein by reference
in its entirety) in culture medium containing high glucose
DMEM supplemented with 15% fetal bovine serum
(Hyclone) and 0.1 μM β-mercaptoethanol. To prepare the
targeting construct for transfection, 25 μg of the targeting
construct was linearized by digestion with Not I,
phenol-chloroform extracted, and ethanol precipitated. The
linearized vector was then electroporated into 1-2x10^[exponent illegible] AK-7 (ES)
cells. The electroporated cells were seeded onto three 10-cm
plates, with one plate receiving 50% of the electroporated
cells and the remaining two plates each receiving 25% of the
electroporated cells. After 24 hours, G418 was added to each
of the plates to a final concentration of 150 μg/ml. After an
additional 24 hours, gancyclovir was added to a final
concentration of 0.2 μM to the 50% plate and one of the 25%
plates. The third plate containing 25% of the electroporated
cells was subjected to only G418 selection to assess the
efficiency of gancyclovir selection. The culture medium for
each plate was changed every day for the first few days, and
then changed as needed after selection had occurred. After
10 days of selection, a portion of each colony was picked
microscopically with a drawn micropipette, and was directly
analyzed by PCR as described by Joyner et al. (Nature 338:
153-156, 1989; which is incorporated herein by reference in
its entirety). Briefly, PCR amplification was performed as
described (Kogan et al., New England J. Med. 317:985-990,
1987; which is incorporated herein by reference in its
entirety) using 40 cycles of 93° C. for 30 seconds, 57° C. for
30 seconds, and 65° C. for 3 minutes. To detect the wild-type
allele, primers JL34 and JL36 (SEQ ID NOS:18 and 19,
respectively) were used in the PCR reaction; to detect the
mutant neuroD allele, primers JL34 and JL40 (SEQ ID
NOS:18 and 20, respectively) were used in the PCR reaction.
Positive colonies, identified by PCR, were subcloned
into 4-well plates, expanded into 60 mm plates and frozen
into 2-3 ampules.

Among the clones that were selected for both G418-resistance
(positive selection for neo gene expression) and
gancyclovir-resistance (negative selection for hsv-tk gene
expression), 10% of the population contained correctly
targeted integration of the vector into the murine neuroD
locus (an overall 10% targeting frequency). The negative
selection provided 4-3 fold enrichment for homologous
recombination events.

To generate chimeric mice, each positive clone was
thawed and passaged once on feeder cells. The transfected
cells were trypsinized into single cells, and blastocysts
obtained from C57BL/6J mice were injected with approximately
15 cells. The injected blastocysts were then
implanted into pseudopregnant mice (C57BL/6JxCBA).
Four male chimeras arose from the injected blastocysts
(AK-71, AK-72, AK-74 and AK-75). The male chimeras
AK-71 and AK-72 gave germ-line transmission at a high
rate as determined by the frequency of agouti coat color
transmission to their offspring (F1) in a cross with C57BL/6J
female mice. Since 50% of the agouti coat color offspring
(F1) should represent heterozygous mutants, their genotypes
were determined by Southern blot analysis. Briefly, genomic
DNA prepared from tail biopsies was digested with Eco RI
and probed with the 1.4 kb 5' genomic sequence used to
make the targeting construct. This probe detects a 4 kb Eco
RI fragment from the wild-type allele and a 6.3 kb Eco RI
fragment from the mutant allele. Therefore, a Southern
analysis would show a single 4 kb band for a wild-type
mouse, 4 kb and 6.3 kb fragments for a heterozygous mouse,
and a single 6.3 kb band for a homozygous mutant mouse.
The resulting offspring (F1) heterozygous (+/-) mice were
mated with sibling heterozygous mice to give rise to the
homozygous (-/-) mutant mice.

To study neuroD expression patterns in embryonic mice,
chimeric mice or F1 heterozygous progeny from the chimera
x C57BL/6J mating were crossed with C57BL/6J. Litters
resulting from these crosses were harvested from pregnant
females and stained for β-galactosidase activity. The
embryos were dissected away from all the extra-embryonic
tissue and the yolk sac was reserved for DNA analysis. The
embryos were fixed for one hour in a Fix solution (0.1M
phosphate buffer containing 0.2% glutaraldehyde, 2%
formaldehyde, 5 mM EGTA (pH 7.3), 2 mM MgCl2). The
fixing solution was removed by three thirty-minute rinses
with rinse solution (0.1M phosphate buffer (pH 7.3)
containing 2 mM MgCl2, 0.1% sodium deoxycholate, 0.2%
NP-40). The fixed embryos were stained overnight in the
dark in rinse solution containing 1 mg/ml X-gal, 5 mM
sodium ferricyanide, 5 mM sodium ferrocyanide. After
staining, the embryos were rinsed with PBS and stored in the
Fix solution before preparation for examination. Examination
of stained tissue from fetal and postnatal mice heterozygous
for the mutation confirmed neuroD expression pattern
in neuronal cells demonstrated by in situ hybridization
(Example 4) and also demonstrated neuroD expression in
the pancreas and gastrointestinal tract.

Blood glucose levels were detected using PRECISION
QID blood glucose test strips and a PRECISION QID blood
glucose sensor (Medisense Inc., Waltham, Mass.) according
to the manufacturer's instruction. A tissue sample was taken
for DNA analysis and the pups were fixed for further
histological examination. Mice homozygous for the mutation
(neuroD) had blood glucose levels between 2 and 3 times
higher than the blood glucose level of wild-type mice.
Heterozygous mutants exhibited similar blood glucose levels
as wild-type mice. Mice that were homozygous for the
mutation (lacking neuroD) had diabetes as demonstrated by
high blood glucose levels and died by day four; some
homozygous mice died at birth.

From the foregoing it will be appreciated that, although
specific embodiments of the invention have been described
herein for purposes of illustration, various modification may
be made without deviating from the spirit and scope of the
invention.



SEQUENCE LISTING

(1) GENERAL INFORMATION:

(iii) NUMBER OF SEQUENCES: 20

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2089 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mus musculus

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 229..1303

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ACTACOCAOC ACCGAGGTAC ACACACOCCA GCATGAAGCA CTGCGTTTAA CTTTTCCTGO 60 

AOCCATCCAT TTTGCAOTGG ACTCCTOTOT ATTTCTATTT GTOTOCATTT CTGTAGGATT 120 

AGGGAGAOOG AGCTGAAGGC TTATCCAOCT TTTAAATATA GCOOOTGOAT TTCCCCCCCT 180 

TTCTTCTTCT OCTTGCCTCT CTCCCTGTTC AATACAGGAA GTGGAAAC ATG ACC AAA 237 

Met Tht Lyt 
1 

TCA TAC AQC OAO AGC GOO CTO ATG GOC GAG CCT CAG CCC CAA GGT CCC 2B5 

Ser Tyr Scr OJu Scr GJy Lea Mcl Oly OJu Pro Glo Pro Gin Gly Pro 

5 JO 15 

CCA AOC TGO ACA GAT GAG TGT CTC AGT TCT CAG CAC GAG GAA CAC GAG 3 53 

Pfo Ser Trp Tbr Asp Gl« Cy» Lc« S«r Ser Gin A<p Glo Gin His Glu 

20 35 30 35 

GCA GAC AAO AAA CAG CAC GAG CTT GAA CCC ATG AAT GCA GAG GAG GAC 381 

AJa Afp L/s Lys Glu Aap Glu Leu Glu Al< Mel Aaa Ala Glu Ciu Asp 

4 0 4 5 5 0 

TCT CTG AGA AAC GGO GOA GAG GAG GAG GAO GAA GAT GAG GAT CTA GAG 429 

ScT L«u Are Asn Gly Gly Gtu Glu Glu Glu Glu Asp Glu Asp Leu Glu 

5 5 6 0 6 5 

GAA GAG GAG GAA GAA GAA GAG GAG GAO GAG GAT CAA AAG CCC AAG AGA 477 

Gin Glu Glu Glu Glu GlQ Glu Glu Glu Glu A»p Gin Lys Pro Lyt Aig 

7 0 7 5 » 0 

COG GGT CCC AAA AAG AAA AAG ATG ACC AAO GCG CCC CTA GAA COT TTT 523 

Axg Gly Pro Lyg Lys Lys Lys Met Tbr Lys Ala Arg Leu Glu Ars Pbc 

8 5 9 0 9 5 

AAA TTA AGO CGC ATG AAG GCC AAC CCC COC GAG CGG AAC CGC ATG CAC 573 

Lys Lett Arg Arg Met Lys Ala A>n Ala Arg Glu Arg A&n Atg Met His 

100 105 110 lis 

GOO CTO AAC GCG GCC CTG GAC AAC CTG COC AAO GTO GTA CCT TGC TAC 62 1 

Oly Leu Aso Ala Ala Leu Asp Asc Leu Arg Lys Val Val Pro Cys Tyr 

12 0 12 5 13 0 

TCC AAG ACC CAG AAA CTG TCT AAA ATA GAO ACA CTG CGC TTO OCC AAG 669 

S«r Lyi Tbr Oln Lys Leu Ser Lys lie Glu Tbr Leu Arg Leu Ala Lyi 

13 5 14 0 14 5 

AAC TAC ATC TGG OCT CTG TCA GAO ATC CTG CGC TCA GGC AAA AGC CCT 7 17 

Ass Tyr lie Tip Ala Leu Ser Glu lie Leu Arg Ser Gly Lys Ser Pro 

15 0 15 5 16 0 

GAT CTG GTC TCC TTC OTA CAG ACG CTC TGC AAA GGT TTG TCC CAG CCC .765 

Asp Leu Val Ser Pbe Val Gin Thr Leu Cys Lys Gly Leu Sei Gin Pro 

I 6 J 17 0 17 5 

ACT ACC AAT TTG GTC GCC GOC TGC CTG CAG CTC AAC CCT COO ACT TTC 8 13 

Tbr Tbr Asn Leu Val Ala Gly Cys Leu Gla Leu Asa Pro Arg Tbr Pbe 

180 185 190 195 

TTO CCT GAG CAG AAC CCG OAC ATG CCC CCG CAT CTG CCA ACC GCC AGC 86 1 

Leu Pro Glu Gin Asn Pro Asp Met Pro Pto Hit Leo Pro Tbi Ala Scr 

200 205 210 

OCT TCC TTC CCG OTG CAT CCC TAC TCC TAC CAG TCC CCT GGA CTO CCC 909 

Ala Set Pbe Pro Val His Pro Tyx Ser Tyr Gla Ser Pro Gly Len Pro 

315 220 225 

AGC CCG CCC TAC GGC ACC ATG GAC AGC TCC CAC GTC TTC CAC GTC AAG 957 

Sex -Pro Pro Tyr Gly Tbr Mel Atp Ser Ser His Val Pbe His Val Lys 

230 235 240 

CCG CCG CCA CAC GCC TAC AGC OCA GCT CTO GAG CCC TTC TTT GAA AGC 1005 

Pro Pro Pro His Ala Tyr Sei Ala Ala Leu Glu Pro Pbc Pbe Glu Ser 

245 250 255 

CCC CTA ACT OAC TGC ACC AGC CCT TCC TTT GAC GGA CCC CTC AOC CCG 1053 

Pro Leu Tbr As p Cys Tbi Ser Pio Ser Pbe Aip Gly Pro Leo Ser Pro 

260 265 270 275

CCG CTC AGC ATC AAT GGC AAC TTC TCT TTC AAA CAC GAA CCA TCC GCC 1101

Pro Leu Ser Ile Asn Gly Asn Phe Ser Phe Lys His Glu Pro Ser Ala

280 285 290

GAG TTT GAA AAA AAT TAT GCC TTT ACC ATG CAC TAC CCT OCA GCG ACO 1149 
Gin Pfa« Glu Ljs Asn Tyr Ala Pbe Tbr Mot U i a Tyi Pio Ala Ala Tbx 

295 300 305 

CTO OCA OGG CCC CAA AGC CAC GOA TCA ATC TTC TCT TCC OCT GCC OCT 1197 
Leu Ala Gly Pro Glo Ser His Gly Sci lie Pbe Scr Scr Gly Ala Ala 
3 10 3 15 3 2 0 

OCC CCT COC TGC GAG ATC CCC ATA OAC AAC ATT ATG TCT TTC GAT AOC 1245 
Ala Pro Arg C>» OJu lie Pro lie Asp Asn lie Met Ser Pbe A«p Scr 
325 330 335 

CAT TCO CAT CAT GAG CGA GTC ATG AOT OCC CAG CTT AAT GCC ATC TTT 1293 
Hia Ser Hla His Glu Arg Val Met Sex Ala Gla Leu A>a Ala lie Pbe 
340 345 350 355 

CAC OAT TAOAGGOCAC GTCAOTTTCA CTATTCCCGG GAAACOAATC CACTCTGCGT 1349 
H 1 f A t p 

ACAOTGACTO TCCTGTTTAC AOAAGGCAOC CCTTTTGCTA AOATTGCTOC AAAGTOCAAA 1409 

TACTCAAAGC TTCAAGTGAT ATATGTATTT ATTGTCGTTA CTGCCTTTGG AAGAAACAOO 14 6 9 

OOATCAAAGT TCCTGTTCAC CTTATOTATT OTTTTCTATA GCTCTTCTAT TTTAAAAATA 1529 

ATAATACAGT AAAGTAAAAA AGAAAATGTO TACCACGAAT TTCGTGTAGC TGTATTCAGA 1589 

TCOTATTAAT TATCTGATCG GGATAAAAAA AATCACAAGC AATAATTAGG ATCTATGCAA 1649 

TTTTTAAACT AOTAATOOGC CAATTAAAAT ATATATAAAT ATATATTTTT CAACCAGCAT 1709 

TTTACTACCT OTGACCTTTC CCATGCTGAA TTATTTTOTT GTGATTTTGT ACAGAATTTT 1769 

TAATOACTTT TTATAACGTO OATTTCCTAT TTTAAAACCA TGCAGCTTCA TCAATTTTTA 1829 

TACATATCAO AAAAOTAGAA TTATATCTAA TTTATACAAA ATAATTTAAC TAATTTAAAC 1889 

CAOCAGAAAA GTGCTTAGAA AGTTATTGCG TTGCCTTAOC ACTTCTTTCT TCTCTAATTO 1949 

TAAAAAAOAA AAAAAAAAAA AAAAAACTCG AGGGOGGGCC COOTACCCAO CTTTTGTTCC 2009 

CTTTACTOAG GGTTAATTGC GCGCTTOGCG TAATCATGOT CATAOCTOTT TCCTGTGTGA 2069 

ATTGTTATCC GCTCACAATT 2089 

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 357 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Thr Lys Ser Tyr Ser Glu Ser Gly Leu Met Gly Glu Pro Gln Pro
1 5 10 15

Gln Gly Pro Pro Ser Trp Thr Asp Glu Cys Leu Ser Ser Gln Asp Glu
20 25 30

Glu His Glu Ala Asp Lys Lys Glu Asp Glu Leu Glu Ala Met Asn Ala
35 40 45

Glu Glu Asp Ser Leu Arg Asn Gly Gly Glu Glu Glu Glu Glu Asp Glu
50 55 60

Asp Leu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Asp Gln Lys
65 70 75 80

Pro Lys Arg Arg Gly Pro Lys Lys Lys Lys Met Thr Lys Ala Arg Leu
85 90 95

Glu Arg Phe Lys Leu Arg Arg Met Lys Ala Asn Ala Arg Glu Arg Asn
100 105 110

Arg Met His Gly Leu Asn Ala Ala Leu Asp Asn Leu Arg Lys Val Val
115 120 125

Pro Cys Tyr Ser Lys Thr Gln Lys Leu Ser Lys Ile Glu Thr Leu Arg
130 135 140

Leu Ala Lys Asn Tyr Ile Trp Ala Leu Ser Glu Ile Leu Arg Ser Gly
145 150 155 160

Lys Ser Pro Asp Leu Val Ser Phe Val Gln Thr Leu Cys Lys Gly Leu
165 170 175

Ser Gln Pro Thr Thr Asn Leu Val Ala Gly Cys Leu Gln Leu Asn Pro
180 185 190

Arg Thr Phe Leu Pro Glu Gln Asn Pro Asp Met Pro Pro His Leu Pro
195 200 205

Thr Ala Ser Ala Ser Phe Pro Val His Pro Tyr Ser Tyr Gln Ser Pro
210 215 220

Gly Leu Pro Ser Pro Pro Tyr Gly Thr Met Asp Ser Ser His Val Phe
225 230 235 240

His Val Lys Pro Pro Pro His Ala Tyr Ser Ala Ala Leu Glu Pro Phe
245 250 255

Phe Glu Ser Pro Leu Thr Asp Cys Thr Ser Pro Ser Phe Asp Gly Pro
260 265 270

Leu Ser Pro Pro Leu Ser Ile Asn Gly Asn Phe Ser Phe Lys His Glu
275 280 285

Pro Ser Ala Glu Phe Glu Lys Asn Tyr Ala Phe Thr Met His Tyr Pro
290 295 300

Ala Ala Thr Leu Ala Gly Pro Gln Ser His Gly Ser Ile Phe Ser Ser
305 310 315 320

Gly Ala Ala Ala Pro Arg Cys Glu Ile Pro Ile Asp Asn Ile Met Ser
325 330 335

Phe Asp Ser His Ser His His Glu Arg Val Met Ser Ala Gln Leu Asn
340 345 350

Ala Ile Phe His Asp
355



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1275 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Xenopus laevis

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 25..1083

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ATTTCCTTTC TCCAGATCTA AAAA ATO ACC AAA TCO TAT OCA OAG AAT GGG 5 1 

Met Thr Lys Ser Tyr Oly Glu Atn Gly 
I 5 

CTG ATC CTG GCC GAG ACT CCG GGC TGC AOA OGA TOG GTG GAC GAA TGC 99 
Leu lie Leu Ala Glu Tbr Pro Gly Cys Arg Gly Trp Val Asp Glu Cys 
10 15 20 2 5 

CTG AGT TCT CAG GAT GAA AAC GAT CTG GAG AAA AAG GAG GGA GAG TTG 147
Leu Ser Ser Gln Asp Glu Asn Asp Leu Glu Lys Lys Glu Gly Glu Leu

30 35 40

ATG AAA GAA CAC GAT OAA GAC TCA CTG AAT CAT CAC AAT GOA GAG GAG 195 

Met Lyi Gio A«p Aip Glu Aip Sei Lev Asn Hi* Hia A»n Gly Olo Gla 

4 5 5 0 5 5 

AAC GAG OAA GAG GAT GAA GGG GAT GAG GAG GAG GAG GAC GAT GAA GAT 243 

Asa Olo Glu Glu A » p Olu Gly Asp Glu Glu Olo Glo A»p Aap Glu Aap 
6 0 6 3 7 0 

GAT OAT GAG GAT GAC GAC CAG AAA CCC AAA AGO COA GOA CCO AAA AAO 291 

Alp A»p Olu Aap Asp A»p O I n Ly» Pro Lys Aig Aig Gly Pro Lys Ly» 

7 5 8 0 8 5 

AAA AAA ATO AGO AAA OCC COG GTO GAG CGA TTT AAA OTG AGA CGC ATO 339 

Ly» Ly» Met Tli r Ly» Ala Arg Val Glu Arg Phe Ly» Val Axg Aig Met 

90 95 100 105 

AAO OCA AAC OCC AGO GAG AGO AAT CGC ATG CAC OCA CTC AAC GAT GCC 387 

Lyi At* Afo Ala Arg Glu Aig A»o Arg Met Hi* Gly Leu Abu Asp Ala 

110 115 12 0 

CTG GAC AGT CTG CGC AAA GTT OTG CCC TGC TAC TCC AAA ACA CAA AAG 435 

Leu Afp Ser Leu Aig Lys Val Val Pro Cys Tyr Ser Lys Tbr Gla Lya 

12 5 13 0 13 5 

TTO TCT AAG ATT GAA ACT CTG CGC CTG GCT AAG AAC TAC ATC TOO GCT 483 

Leu Ser Ly» lie Olu Tbr Leo Arg Leu Ala Ly* Asn Tyr lie Trp Ala 

14 0 14 5 15 0 

CTT TCT OAG ATT TTA AOO TCC OOC AAA AGC CCA GAC CTG GTG TCC TTT 53 1 

Leu Ser Olu lie Leu Aig Ser Oly Lya Ser Pro Asp Leu VaJ Ser Phe 

15 5 16 0 16 5 



GTA CAA ACT CTC TGC AAA GGT TTG TCG CAG CCC ACC ACC AAT CTA GTA 579

Val Gln Thr Leu Cys Lys Gly Leu Ser Gln Pro Thr Thr Asn Leu Val
170 175 180 185



GCO GGG TOT CTG CAG CTG AAC CCC AGA ACT TTC CTT CCT GAG CAG AGT 627 

Ala Gly Cy« Leu Gin Leu Asa Pro Arg The Phe Leu Pro Olu Gin Scr 

19 0 19 5 2 0 0 

CAG OAC ATC CAG TCG CAC ATG CAA ACA GCG AGC TCT TCC TTC CCT CTG 675 

Ola Aap lie Glo Sei His Met Glo Tbr Ala Ser Ser Scr Pbe Pro Leu 

2 0 5 2 10 2 15 

CAO OGC TAT CCC TAT CAG TCC CCT COT CTT CCC AGT CCC CCC TAT GOT 723 

Oln Oly Tyr Pro Tyr Cln Ser Pro Gly Leu Pre Ser Pro Pro Tyr Gly 

220 225 230 

ACC ATO GAC AOC TCC CAT GTA TTC CAC GTC AAO CCT CAC TCC TAT OOO 771 

Thr Met Aap Ser Scr His Val Pbe His Val Lys Pro His Scr Tyr Gly 

235 240 245 

OCO OCC CTG GAG CCT TTC TTT GAC AGC AGC ACC GTC ACT OAO TGT ACC 819 

Ala Ala Leu Glu Pro Phe Pbe Asp Scr Ser Thi Val Tbr GJu Cy» Thr 

250 255 260 265 

AOC CCO TCA TTC GAT GGT CCC CTG AGC CCA CCC CTT AGT GTT AAT OGG 867 

Ser Pro Ser Phe Asp Gly Pro Leu Ser Pro Pro Leu Ser Val Asa Gly 

270 275 280 

AAC TTT ACT TTT AAA CAC GAG CAT TCG OAG TAT OAT AAA AAT TAC AGO 915 

Asa Pbe Thr Phe Lys His Glu His Ser Glo Tyr Asp Lys Asn Tyr Thr 

285 290 295 

TTC ACT ATG CAC TAT CCT GCA GCC ACT ATA TCC CAG GGC CAC GOA CCA 963 

Phe Thr Mel His Tyr Pro Ala Ala Tbr lie Ser Gla Gly His Gly Pro 

300 305 310 

TTO TTC TCC ACC GCG GGA CCA CGC TGT OAA ATC CCA ATA GAC ACC ATC 1011 

Leu Phe Sci Tbr Gly Oly Pro Arg Cys Glo lie Pro lie Aap Tbr lie 

3 15 320 325 

ATO TCC TAT OAC GGT CAC TCC CAC CAT OAA AGA GTC ATO ACT OCC CAG 1059 

Met Ser Tyx Aip Oly His Scr His His Glu Arg Val Met Scr Ala Gin 

330 335 340 345 

CTA AAT GCC ATC TTT CAT GAT TAACCCTTGG AAGATCAAAA CAACTGACTG 1110
Leu Asn Ala Ile Phe His Asp

350

TOCATTGCCA GCACTGTCTT GTTTACCAAO GGCAGACACG TGGGTAGTAA AAGTGCAAAT 1170 
GCCCCACTCT GGOGCTOTAA CAAACTTGAT CTTGTCCTOC CTTTAGATAT GGGGAAACCT J230 
AATGTATTAA TTCCCACCTC CTTCCAATCG ACACTCCTTT AAATT 1275 

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 352 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Thr Lys Ser Tyr Gly Glu Asn Gly Leu Ile Leu Ala Glu Thr Pro
1 5 10 15

Gly Cys Arg Gly Trp Val Asp Glu Cys Leu Ser Ser Gln Asp Glu Asn
20 25 30

Asp Leu Glu Lys Lys Glu Gly Glu Leu Met Lys Glu Asp Asp Glu Asp
35 40 45

Ser Leu Asn His His Asn Gly Glu Glu Asn Glu Glu Glu Asp Glu Gly
50 55 60

Asp Glu Glu Glu Glu Asp Asp Glu Asp Asp Asp Glu Asp Asp Asp Gln
65 70 75 80

Lys Pro Lys Arg Arg Gly Pro Lys Lys Lys Lys Met Thr Lys Ala Arg
85 90 95

Val Glu Arg Phe Lys Val Arg Arg Met Lys Ala Asn Ala Arg Glu Arg
100 105 110

Asn Arg Met His Gly Leu Asn Asp Ala Leu Asp Ser Leu Arg Lys Val
115 120 125

Val Pro Cys Tyr Ser Lys Thr Gln Lys Leu Ser Lys Ile Glu Thr Leu
130 135 140

Arg Leu Ala Lys Asn Tyr Ile Trp Ala Leu Ser Glu Ile Leu Arg Ser
145 150 155 160

Gly Lys Ser Pro Asp Leu Val Ser Phe Val Gln Thr Leu Cys Lys Gly
165 170 175

Leu Ser Gln Pro Thr Thr Asn Leu Val Ala Gly Cys Leu Gln Leu Asn
180 185 190

Pro Arg Thr Phe Leu Pro Glu Gln Ser Gln Asp Ile Gln Ser His Met
195 200 205

Gln Thr Ala Ser Ser Ser Phe Pro Leu Gln Gly Tyr Pro Tyr Gln Ser
210 215 220

Pro Gly Leu Pro Ser Pro Pro Tyr Gly Thr Met Asp Ser Ser His Val
225 230 235 240

Phe His Val Lys Pro His Ser Tyr Gly Ala Ala Leu Glu Pro Phe Phe
245 250 255

Asp Ser Ser Thr Val Thr Glu Cys Thr Ser Pro Ser Phe Asp Gly Pro
260 265 270

Leu Ser Pro Pro Leu Ser Val Asn Gly Asn Phe Thr Phe Lys His Glu
275 280 285

His Ser Glu Tyr Asp Lys Asn Tyr Thr Phe Thr Met His Tyr Pro Ala
290 295 300

Ala Thr Ile Ser Gln Gly His Gly Pro Leu Phe Ser Thr Gly Gly Pro
305 310 315 320

Arg Cys Glu Ile Pro Ile Asp Thr Ile Met Ser Tyr Asp Gly His Ser
325 330 335

His His Glu Arg Val Met Ser Ala Gln Leu Asn Ala Ile Phe His Asp
340 345 350



(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Asn Ala Arg Glu Arg Arg Arg
1 5



(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Asn Glu Arg Glu Arg Asn Arg
1 5



(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Asn Ala Arg Glu Arg
1 5



(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 524 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:
(B) CLONE: 9F1

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 57..524

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

TTTTTCTGCT TTTCTTTCTG TTTGCCTCTC CCTTGTTGAA TGTACGAAAT CGAAAC 56

ATG ACC AAA TCG TAC AGC GAG AGT GGG CTG ATG GGC GAG CCT CAG CCC 104
Met Thr Lys Ser Tyr Ser Glu Ser Gly Leu Met Gly Glu Pro Gln Pro
1 5 10 15

CAA GGT CCT CCA AGC TGG ACA GAC GAG TGT CTC AGT TCT CAG GAC GAG 152
Gln Gly Pro Pro Ser Trp Thr Asp Glu Cys Leu Ser Ser Gln Asp Glu
20 25 30

GAG CAC GAG GCA GAC AAG AAG GAG GAC GAC CTC GAA GCC ATG AAC GCA 200
Glu His Glu Ala Asp Lys Lys Glu Asp Asp Leu Glu Ala Met Asn Ala
35 40 45

GAG GAG GAC TCA CTG AGG AAC GGG GGA GAG GAG GAG GAC GAA GAT GAG 248
Glu Glu Asp Ser Leu Arg Asn Gly Gly Glu Glu Glu Asp Glu Asp Glu
50 55 60

GAC CTG GAA GAG GAG GAA GAA GAG GAA GAG GAG GAT GAC GAT CAA AAG 296
Asp Leu Glu Glu Glu Glu Glu Glu Glu Glu Glu Asp Asp Asp Gln Lys
65 70 75 80

CCC AAG AGA CGC GGC CCC AAA AAG AAG AAG ATG ACT AAG GCT CGC CTG 344
Pro Lys Arg Arg Gly Pro Lys Lys Lys Lys Met Thr Lys Ala Arg Leu
85 90 95

GAG CGT TTT AAA TTG AGA CGC ATG AAG GCT AAC GCC CGG GAG CGG AAC 392
Glu Arg Phe Lys Leu Arg Arg Met Lys Ala Asn Ala Arg Glu Arg Asn
100 105 110

CGC ATG CAC GGA CTG AAC GCG GCG CTA GAC AAC CTG CGC AAG GTG GTG 440
Arg Met His Gly Leu Asn Ala Ala Leu Asp Asn Leu Arg Lys Val Val
115 120 125

CCT TGC TAT TCT AAG ACG CAG AAG CTG TCC AAA ATC GAG ACT CTG CGC 488
Pro Cys Tyr Ser Lys Thr Gln Lys Leu Ser Lys Ile Glu Thr Leu Arg
130 135 140

TTG GCC AAG AAC TAC ATC TGG GCT CTG TCG GAG ATC 524
Leu Ala Lys Asn Tyr Ile Trp Ala Leu Ser Glu Ile
145 150 155

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 156 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Met Thr Lys Ser Tyr Ser Glu Ser Gly Leu Met Gly Glu Pro Gln Pro
1 5 10 15

Gln Gly Pro Pro Ser Trp Thr Asp Glu Cys Leu Ser Ser Gln Asp Glu
20 25 30

Glu His Glu Ala Asp Lys Lys Glu Asp Asp Leu Glu Ala Met Asn Ala
35 40 45

Glu Glu Asp Ser Leu Arg Asn Gly Gly Glu Glu Glu Asp Glu Asp Glu
50 55 60

Asp Leu Glu Glu Glu Glu Glu Glu Glu Glu Glu Asp Asp Asp Gln Lys
65 70 75 80

Pro Lys Arg Arg Gly Pro Lys Lys Lys Lys Met Thr Lys Ala Arg Leu
85 90 95

Glu Arg Phe Lys Leu Arg Arg Met Lys Ala Asn Ala Arg Glu Arg Asn
100 105 110

Arg Met His Gly Leu Asn Ala Ala Leu Asp Asn Leu Arg Lys Val Val
115 120 125

Pro Cys Tyr Ser Lys Thr Gln Lys Leu Ser Lys Ile Glu Thr Leu Arg
130 135 140

Leu Ala Lys Asn Tyr Ile Trp Ala Leu Ser Glu Ile
145 150 155

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1352 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:
(B) CLONE: 14B1

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 55..1194

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

CCCCTCACTT TGTGCTGTCT GTCTCCCCTT CCCGCCCOGG GNCCCTCAGG CACCATGCTO 60 

ACCCOCCTOT TCAGCGAGCC CGGCCTTCTC TCGGACOTGC CCAAGTTCGC CAGCTGGGCC 120 

OACGOCGAAO ACGACGAGCC GAGGAGCGAC AAGGOCGACC CGCCGCCACC GCCACCOCCT 180 

OCOCCCGGGC CAGOGOCTCC GGGGCCAGCC COGGCGOCCA AGCCAGTCCC TCTCCCTOGA 240 

GAAGAGGOGA CGGAGGCCAC GTTGGCCGAG GTCAAGOAGG AAGOCGAGCT GGGGGGAGAG 300 

GAGGAGGAGG AAGAGGAGGA GGAAGAAGGA CTGGACGAGG CGGAGGGCGA GCGGCCCAAG 360 

AAOCOCGOGC CCAAGAAGCG CAAGATGACC AAOOCOCOCT TOOAOCOCTC CAAOCTTCQG 420 

COGCAOAAGG COAACOCGCG GGAGCGCAAC CGCATOCACG ACCTGAACGC AQCCCTGOAC 480 

AACCTOCOCA AOGTOOTGCC CTGCTACTCC AAGACGCAOA AGCTGTCCAA GATCGAGACG 540 

CTOCOCCTAG CCAAOAACTA TATCTGOGCG CTCTCOOAGA TCCTOCGCTC CGGCAAOCGG 600 

CCAOACCTAG TGTCCTACGT GCAGACTCTG TGCAAGGGTC TGTCGCAGCC CACCACCAAT 660 

CTOOTOOCCO GCTGTCTGCA GCTCAACTCT CGCAACTTCC TCACGGAGCA AGGCCGCGAC 720 

GOTOCONNCC GCTTCCACGG CTCGGGCOGC CCGTTCGCCA TGCACCCCTA CCCGTACCCG 780 

TOCTCOCGTO OCGGGCOGAC AGTGCCAGGC GCGGCGGCCT GGGCOGCGGC CGGCGCACGC 840 

CTOCOOACCC ACGGCTACTO CGCCGCCTAC GAOACGCTOT ATGCOGCGGC AGGCGGTGOC 900 

GOCGCGAGCC CGOACTACAA CAGCTCCGAG TACGAOGGCC CGCTCAGCCC CCCOCTCTGT 960 

CTCAATOGCA ACTTCTCACT CAAGCAOGAC TCCTCGCCCG ACCACGAGAA AAGCTACCAC 1020 

TACTCTATOC ACTACTCOQO CTOCCCNOOT TCOCGCCACO GNCACGGGCT AGTCTTCOOC 1080 

TCOTCGOCTG TOCOCGOGQG COTCCACTCO OAOAATCTCT TOTCTTACGA TATGCACCTT 1140 

CACCACOANC GGOQCCCCAT GTNCNAGOAO CTCAATOCGT TTTTTCATAA CTGAGACTTC 1200 

aCOCCONCTC CCTNCTTTTT CTTTTGCCTT TGCCCOCCCC CCTGTCCCCA GCCCCCAGAG 1260 

COCAOGGACA CCCCCATNCT ACCCCGGCNC CGGCGGAGCO GGCCACCGGT CTGCCOCTCT 1320 

CCTOGOGCAG CGCAGTCTGT TACNTOTOOT go 1352 

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 379 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

Met Leu Thr Arg Leu Phe Ser Glu Pro Gly Leu Leu Ser Asp Val Pro
1 5 10 15

Lys Phe Ala Ser Trp Gly Asp Gly Glu Asp Asp Glu Pro Arg Ser Asp
20 25 30

Lys Gly Asp Ala Pro Pro Pro Pro Pro Pro Ala Pro Gly Pro Gly Ala
35 40 45

Pro Gly Pro Ala Arg Ala Ala Lys Pro Val Pro Leu Arg Gly Glu Glu
50 55 60

Gly Thr Glu Ala Thr Leu Ala Glu Val Lys Glu Glu Gly Glu Leu Gly
65 70 75 80

Gly Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Gly Leu Asp Glu Ala
85 90 95

Glu Gly Glu Arg Pro Lys Lys Arg Gly Pro Lys Lys Arg Lys Met Thr
100 105 110

Lys Ala Arg Leu Glu Arg Ser Lys Leu Arg Arg Gln Lys Ala Asn Ala
115 120 125

Arg Glu Arg Asn Arg Met His Asp Leu Asn Ala Ala Leu Asp Asn Leu
130 135 140

Arg Lys Val Val Pro Cys Tyr Ser Lys Thr Gln Lys Leu Ser Lys Ile
145 150 155 160

Glu Thr Leu Arg Leu Ala Lys Asn Tyr Ile Trp Ala Leu Ser Glu Ile
165 170 175

Leu Arg Ser Gly Lys Arg Pro Asp Leu Val Ser Tyr Val Gln Thr Leu
180 185 190

Cys Lys Gly Leu Ser Gln Pro Thr Thr Asn Leu Val Ala Gly Cys Leu
195 200 205

Gln Leu Asn Ser Arg Asn Phe Leu Thr Glu Gln Gly Arg Asp Gly Ala
210 215 220

Xaa Arg Phe His Gly Ser Gly Gly Pro Phe Ala Met His Pro Tyr Pro
225 230 235 240

Tyr Pro Cys Ser Arg Gly Gly Arg Thr Val Pro Gly Ala Ala Ala Trp
245 250 255

Ala Ala Ala Gly Ala Arg Leu Arg Thr His Gly Tyr Cys Ala Ala Tyr
260 265 270

Glu Thr Leu Tyr Ala Ala Ala Gly Gly Gly Gly Ala Ser Pro Asp Tyr
275 280 285

Asn Ser Ser Glu Tyr Glu Gly Pro Leu Ser Pro Pro Leu Cys Leu Asn
290 295 300

Gly Asn Phe Ser Leu Lys Gln Asp Ser Ser Pro Asp His Glu Lys Ser
305 310 315 320

Tyr His Tyr Ser Met His Tyr Ser Gly Cys Pro Gly Ser Arg His Gly
325 330 335

His Gly Leu Val Phe Gly Ser Ser Ala Val Arg Gly Gly Val His Ser
340 345 350

Glu Asn Leu Leu Ser Tyr Asp Met His Leu His His Xaa Arg Gly Pro
355 360 365

Met Xaa Xaa Glu Leu Asn Ala Phe Phe His Asn
370 375

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 310 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:
(B) CLONE: 20A1

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..310

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

CCCCOGCGTN CTGAGOTCCA GGGGCACAGG ACGACGAOCA OGAGAOGCGG COOCGOCCGG 60 

ACOCONOTCC CTCCOAGGCG CTGCTGCACN CGCTGCGCAG GAGCGGCGCG TCAAGGCCAA J 20 

COATCGCGAO CGCAACCGCA TOCACAACTT GAACGCOOCC CTGOACGCAC TGCGCAGCOT ISO 

OCTOCCCTCG TTCCCCGACO ACACCAAGCT CACCAAAATC GAGAGCCTGC OTTNCGCCTA 240 

CAACTACATC TOGGCTCTOO CCGAOACACT GCGCTOGCOG ATNAAGGOCT GCCCOGAGGC 300 

OOTOCCCOOO 3 J 0 

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 103 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

Pro Gly Val Leu Arg Ser Arg Gly Thr Gly Arg Arg Ala Gly Glu Ala
1 5 10 15

Ala Ala Ala Gly Arg Xaa Ser Leu Arg Gly Ala Ala Ala Xaa Ala Ala
20 25 30

Gln Glu Arg Arg Val Lys Ala Asn Asp Arg Glu Arg Asn Arg Met His
35 40 45

Asn Leu Asn Ala Ala Leu Asp Ala Leu Arg Ser Val Leu Pro Ser Phe
50 55 60

Pro Asp Asp Thr Lys Leu Thr Lys Ile Glu Ser Leu Arg Xaa Ala Tyr
65 70 75 80

Asn Tyr Ile Trp Ala Leu Ala Glu Thr Leu Arg Trp Arg Xaa Lys Gly
85 90 95

Cys Pro Glu Ala Val Pro Gly
100

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1560 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear






(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:
(B) CLONE: HC2A

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 57..1126

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:



TTTTTCTGCT TTTCTTTCTG TTTGCCTCTC CCTTCTTOAA TOTAGGAAAT CGAAACATGA 60 

CCAAATCGTA CAGCOAGAGT GOGCTGATOG OCOAOCCTCA GCCCCAAGGT CCTCCAAOCT 120 

CGACAGACOA GTGTCTCAGT TCTCAGGACG AOGAGCACOA GGCAGACAAO AAGGAGGACG J SO 

ACCTCOAAOC CATGAACGCA OAGGAOOACT CACTGAOGAA COOGGGAGAG GAOGAGGACG 240 

AAGATGACCA CCTGGAAOAG GAGOAAGAAG AGGAAGAGGA GGATGACOAT CAAAAGCCCA 300 

AOAOACOCOO CCCCAAAAAG AAGAAGATGA CTAAGGCTCG CCTGOAOCOT TTTAAATTGA 3€0 

GACGCATGAA GOCTAACOCC CGGOAOCaOA ACCGCATGCA CGGACTOAAC GCGGCGCTAG 420 

ACAACCTGCG CAAGGTGGTG CCTTOCTATT CTAAGACOCA OAAOCTOTCC AAAATCGAGA 480 

CTCTGCOCTT OGCCAAGAAC TACATCTOGO CTCTGTCOGA GATCCTGCGC TCAOOCAAAA 540 

GCCCAOACCT GGTCTCCTTC OTTCAGACGC TTTGCAAQGG CTTATCCCAA CCCACCACCA 600 

ACCTCGTTGC GGGCTGCCTG CAACTCAATC CTCGGACTTT TCTGCCTGAG CACAACCAOG 660 

ACATGCCCCC GCACCTGCCG ACOGCCAGCO CTTCCTTCCC TGTACaCCCC TACTCCTaCC 720 

AGTCGCCTGG GCTGCCCAGT CCGNCTTACG GTACCATGGA CAGCTCCCAT GTCTTCCACG 7gO 

TTAAOCCTCC GCCCCACGCC TACAGCGCAO CGCTGGAGCC CTTCTTTGAA AGCCCTCTGA 840 

CTGATTGCAC CAGCCCTTCC TTTGATGOAC CCCTCAGCCC CCCGCTCAGC ATCAATGGCA 900 

ACTTCTCTTT CAAACACCAA CCGTCCGCCG AGTTTGAOAA AAATTATGCC TTTACCATGC 960 

ACTATCCTOC AOCGACACTO G.CAGGOGCCC AAAGCCACCG ATCAATCTTC TCAGGCACCG 1020 

CTOCCCCTCG CTOCGAGATC CCCATAOACA ATATTATOTC CTTCGATAGC CATTCACATC 1080 

ATGAOCOACT CATGAGTGCC CAGCTCAATG CCATATTTCA TGATTAGAOO CACOCCAGTT 1140 

TCACCATT1C CGGGAAACGA ACCCACTGTG CTTACAGTGA CTGTCGTGTT TACAAAAOGC 1200 

AGCCCTTTGG TACTACTGCT GCAAAGTGCA AATACTCCAA GCTTCAAGTG ATATATCTAT 1260 

TTATTGTCAT TACTGCCTTT GGAAGAAACA GGGGATCAAA GTTCCTGTTC ACCTTATOTA 1320 

TTATTTTCTA TAGACTCTTC TATTTTAAAA AATAAAAAAA TACAGTAAAO TTTAAAAAAT 1380 

ACACCACGAA TTTGGTGTGG CTGTATTCAO ATCGTATTAA TTATCTGATC GGGATAACAA 1440 

AATCACAAGC AATAATTAOO ATCTATOCAA TTTTTAAACT AGTAATGGCC CAATTAAAAT 1500 

ATATATAAAT ATATATTTCA ACCAOCATTT TACTACTTGT TACCTCCCAT GCTGAATTAT 1560 



(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 356 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

Met Thr Lys Ser Tyr Ser Glu Ser Gly Leu Met Gly Glu Pro Gln Pro
1 5 10 15

Gln Gly Pro Pro Ser Trp Thr Asp Glu Cys Leu Ser Ser Gln Asp Glu
20 25 30

Glu His Glu Ala Asp Lys Lys Glu Asp Asp Leu Glu Ala Met Asn Ala
35 40 45

Glu Glu Asp Ser Leu Arg Asn Gly Gly Glu Glu Glu Asp Glu Asp Glu
50 55 60

Asp Leu Glu Glu Glu Glu Glu Glu Glu Glu Glu Asp Asp Asp Gln Lys
65 70 75 80

Pro Lys Arg Arg Gly Pro Lys Lys Lys Lys Met Thr Lys Ala Arg Leu
85 90 95

Glu Arg Phe Lys Leu Arg Arg Met Lys Ala Asn Ala Arg Glu Arg Asn
100 105 110

Arg Met His Gly Leu Asn Ala Ala Leu Asp Asn Leu Arg Lys Val Val
115 120 125

Pro Cys Tyr Ser Lys Thr Gln Lys Leu Ser Lys Ile Glu Thr Leu Arg
130 135 140

Leu Ala Lys Asn Tyr Ile Trp Ala Leu Ser Glu Ile Leu Arg Ser Gly
145 150 155 160

Lys Ser Pro Asp Leu Val Ser Phe Val Gln Thr Leu Cys Lys Gly Leu
165 170 175

Ser Gln Pro Thr Thr Asn Leu Val Ala Gly Cys Leu Gln Leu Asn Pro
180 185 190

Arg Thr Phe Leu Pro Glu Gln Asn Gln Asp Met Pro Pro His Leu Pro
195 200 205

Thr Ala Ser Ala Ser Phe Pro Val His Pro Tyr Ser Tyr Gln Ser Pro
210 215 220

Gly Leu Pro Ser Pro Xaa Tyr Gly Thr Met Asp Ser Ser His Val Phe
225 230 235 240

His Val Lys Pro Pro Pro His Ala Tyr Ser Ala Ala Leu Glu Pro Phe
245 250 255

Phe Glu Ser Pro Leu Thr Asp Cys Thr Ser Pro Ser Phe Asp Gly Pro
260 265 270

Leu Ser Pro Pro Leu Ser Ile Asn Gly Asn Phe Ser Phe Lys His Glu
275 280 285

Pro Ser Ala Glu Phe Glu Lys Asn Tyr Ala Phe Thr Met His Tyr Pro
290 295 300

Ala Ala Thr Leu Ala Gly Ala Gln Ser His Gly Ser Ile Phe Ser Gly
305 310 315 320

Thr Ala Ala Pro Arg Cys Glu Ile Pro Ile Asp Asn Ile Met Ser Phe
325 330 335

Asp Ser His Ser His His Glu Arg Val Met Ser Ala Gln Leu Asn Ala
340 345 350

Ile Phe His Asp
355



(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1462 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Mus musculus

(vii) IMMEDIATE SOURCE:
(B) CLONE: 1.1.1

(ix) FEATURE:
(A) NAME/KEY: CDS
( B ) UXTAnON: 23I..n01 

( X i ) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

GAATTCAAGC TAOAGGCTGG TACCCCGCCT GGTAGAGATG CCACACTCGC TCCGCGGCTC 60 

OCATOGCGCT CTGAAGACGC CGOCGCCCGC COCCTTGAOG AACCGCTGCC CCCGCTCCCT 120 

OAAOATOQOO OAACAATOAA ATAAGCGAGA AGATTCCTCT TCTCCCCCCT CTCTCTCTTG 180 

CCCCCTCCCC CCTCCCCTCC CCTCTCCCCT TGACTCCTCT CTGAOOCACC ATCCTGACCC 240 

GCCTGTTCAG CGAQCCCQOC CTCCTCTCGO ACOTGCCCAA GTTCOCCAOC TGGGGCGACG 300 

GCGACGACCA CGAGCCGaGG AGCOACAAGG GCGACGCCCC GCCGCAGCCT TCTCCTOCTC 360 

CCOGOTCOOG GGCTCCAGGA CCCGCCCGGG CCGCCAAGCC AGTGTCTCTT CCTOGAGOAO 420 

AAGAGATCCC TGAACCCACG TTGGCTOAOG TCAAOOAOGA AGGAGAGCTG GGCGGCGAGG 480 

AOGAGGAGGA AGAGGAGGAG GAGGAAGOAC TGOACOAGGC GGAAGGCGAG CGGCCCAAOA 540 

AOCGCGGGCC GAAGAAACGC AAGATGACCA AOOCGCGTCT GGAGCGCTCC AAOCTGCOGC 600 

OACAGAAOGC CAATGCGCGC GAGCGCAACC OCATGCACGA CCTOAACGCG GCTCTGGACA 660 

ACCTOCOCAA GOTGOTCCCC TGCTACTCCA AGACCCAOAA GCTOTCCAAO ATCCAGACCC 720 

TGCOCCTGGC CAAOAACTAC ATCTOGGCrC TCTCGGAGAT CTTGCGCTCC GGGAACCCCC 780 

CGGATCTGGT GTCCTACGTG CAOACTCTGT GCAAGGGGCT GTCACAGCCC ACCACGAATC 840 

TGOTGCCCGO CTGCCTGCAG TTAAACTCTC GTAACTTCCT CACGGAGCAG GGCGCGGACO 900 

GCOGCCGCTT TCACOGCTCG GOTGOCCCQT TCGCCATOCA TCCGTACCCA TACCCGTGCT 960 

CCCGCCTGGC AGGCCACAGT GTCAGGCGGC TQOCOGCCTG GGCGOAGGNC GGCGCACGCC 1020 

TOCGOACCCA CGGCTACTGC GCCGCCTACG AGACOCTOTA CGCOCCGGCC GCTGGCGGCO 1080 

QCGCTAOCCC GGACTACAAC AGCTCCGAGT ACGAGGGTCC ACTCAGTCCC CCGCTCTGTC 1140 

TCAACGGCAA CTTCTCGCTC AAOCAGOACT CGTCCCCCGA TCACGACAAG AGCTACCACT 1200 

ACTCTATOCA CTACTCGCGC TGCCCNGGCT CACGCCACGG NCACGGOCTC OTCTTCGGCT J 260 

CGTCOOCCOT OCGCGQOOGC GTCCACTCCG AGAATCTCTT GTCTTACGAT ATGCACCTTC 1320 

ACCACOATCG GGGCCCCATG TACGAGGAGC TCAACGCATT TTTCCATAAC TOAGaCCTCN 1380 

COCCGACCCC TTCTTTTTCT TTQCCTTNNT CCGOCCCCTT AGCCCCANCC CCAANANCTC 1440 

ACGNNTCCCA CCGATCTCCA GG 1462 

(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 380 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Mus musculus

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

Met Leu Thr Arg Leu Phe Ser Glu Pro Gly Leu Leu Ser Asp Val Pro
1 5 10 15

Lys Phe Ala Ser Trp Gly Asp Gly Asp Asp Asp Glu Pro Arg Ser Asp
20 25 30

Lys Gly Asp Ala Pro Pro Gln Pro Ser Pro Ala Pro Gly Ser Gly Ala
35 40 45

Pro Gly Pro Ala Arg Ala Ala Lys Pro Val Ser Leu Arg Gly Gly Glu
50 55 60

Glu Ile Pro Glu Pro Thr Leu Ala Glu Val Lys Glu Glu Gly Glu Leu
65 70 75 80

Gly Gly Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Gly Leu Asp Glu
85 90 95

Ala Glu Gly Glu Arg Pro Lys Lys Arg Gly Pro Lys Lys Arg Lys Met
100 105 110

Thr Lys Ala Arg Leu Glu Arg Ser Lys Leu Arg Arg Gln Lys Ala Asn
115 120 125

Ala Arg Glu Arg Asn Arg Met His Asp Leu Asn Ala Ala Leu Asp Asn
130 135 140

Leu Arg Lys Val Val Pro Cys Tyr Ser Lys Thr Gln Lys Leu Ser Lys
145 150 155 160

Ile Glu Thr Leu Arg Leu Ala Lys Asn Tyr Ile Trp Ala Leu Ser Glu
165 170 175

Ile Leu Arg Ser Gly Lys Arg Pro Asp Leu Val Ser Tyr Val Gln Thr
180 185 190

Leu Cys Lys Gly Leu Ser Gln Pro Thr Thr Asn Leu Val Ala Gly Cys
195 200 205

Leu Gln Leu Asn Ser Arg Asn Phe Leu Thr Glu Gln Gly Ala Asp Gly
210 215 220

Gly Arg Phe His Gly Ser Gly Gly Pro Phe Ala Met His Pro Tyr Pro
225 230 235 240

Tyr Pro Cys Ser Arg Leu Ala Gly His Ser Val Arg Arg Leu Ala Ala
245 250 255

Trp Ala Glu Xaa Gly Ala Arg Leu Arg Thr His Gly Tyr Cys Ala Ala
260 265 270

Tyr Glu Thr Leu Tyr Ala Ala Ala Gly Gly Gly Gly Ala Ser Pro Asp
275 280 285

Tyr Asn Ser Ser Glu Tyr Glu Gly Pro Leu Ser Pro Pro Leu Cys Leu
290 295 300

Asn Gly Asn Phe Ser Leu Lys Gln Asp Ser Ser Pro Asp His Glu Lys
305 310 315 320

Ser Tyr His Tyr Ser Met His Tyr Ser Arg Cys Pro Gly Ser Arg His
325 330 335

Gly His Gly Leu Val Phe Gly Ser Ser Ala Val Arg Gly Gly Val His
340 345 350

Ser Glu Asn Leu Leu Ser Tyr Asp Met His Leu His His Asp Arg Gly
355 360 365

Pro Met Tyr Glu Glu Leu Asn Ala Phe Phe His Asn
370 375 380

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:
(B) CLONE: JL34

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

CTCAGCATCA GCAACTCGGC 20

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:
(B) CLONE: JL36

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

TCGGATCCCG TTCTAGCCGC CCCTTGCTC 29



(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:
(B) CLONE: SIAO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

GTTTTCCCAG TCACGACGTT G 21
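
The CDS coordinates given under FEATURE in the listing above can be checked against the stated lengths of the corresponding protein sequences. The following is a minimal illustrative sketch of that arithmetic and is not part of the original disclosure. The function name `codons_in_cds` and the table `LISTING` are assumptions introduced here. The coordinates and lengths are those stated for SEQ ID NOS:3 and 4, 8 and 9, and 10 and 11.

```python
"""Compare CDS spans in the sequence listing with the stated protein lengths."""


def codons_in_cds(start: int, end: int) -> int:
    """Return the number of codons in a CDS given 1-based inclusive coordinates."""
    span = end - start + 1  # inclusive coordinates, as in "25..1083"
    if span <= 0 or span % 3 != 0:
        raise ValueError(f"CDS {start}..{end} is not a whole number of codons")
    return span // 3


# (label, CDS start, CDS end, stated protein length), copied from the listing.
LISTING = [
    ("SEQ ID NO:3 / NO:4", 25, 1083, 352),
    ("SEQ ID NO:8 / NO:9", 57, 524, 156),
    ("SEQ ID NO:10 / NO:11", 55, 1194, 379),
]

for label, start, end, residues in LISTING:
    codons = codons_in_cds(start, end)
    # A difference of 1 means the span also counts the stop codon;
    # 0 means the span ends on a sense codon (for example, a partial clone).
    print(f"{label}: {codons} codons, {residues} residues, extra = {codons - residues}")
```

Under these figures, the span for SEQ ID NO:3 covers one codon more than the 352 residues of SEQ ID NO:4, which is consistent with the terminal TAA being counted in the CDS, whereas the partial clone of SEQ ID NO:8 yields exactly the 156 codons of SEQ ID NO:9.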



The embodiments of the invention in which an exclusive
property or privilege is claimed are defined as follows:

1. An isolated nucleic acid molecule which hybridizes
under stringent conditions with a nucleic acid molecule
selected from among SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14,
SEQ ID NO:16, and complements thereof.

2. A vector comprising in serial array a promoter, and the
nucleic acid molecule of claim 1.

3. A cell in culture transformed by the nucleic acid
molecule of claim 1.

4. A method for inducing differentiation of a non-neuronal
cell in culture into a neuron, comprising introducing a
nucleic acid molecule of claim 1 into the non-neuronal cell.

5. An isolated nucleic acid molecule, wherein the nucleic
acid molecule encodes a polypeptide having an amino acid
sequence selected from among the group consisting of SEQ
ID NO:2, SEQ ID NO:4, SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:15, and SEQ ID NO:17.

6. A vector comprising in serial array a promoter, and the
nucleic acid molecule of claim 5.

7. A cell in culture transformed by the nucleic acid
molecule of claim 5.

8. A method for inducing differentiation of a non-neuronal
cell in culture into a neuronal cell, comprising introducing
the nucleic acid molecule of claim 5, into the non-neuronal cell.

* * * * *



